Re: [PATCH] i386: Avoid fma_chain for -march=alderlake and sapphirerapids.

2022-12-14 Thread Hongyu Wang via Gcc-patches
If there is no objection, I'm going to backport the m_SAPPHIRERAPIDS
and m_ALDERLAKE change to GCC 12.

Uros Bizjak via Gcc-patches wrote on Wed, Dec 7, 2022 at 15:11:
>
> On Wed, Dec 7, 2022 at 7:36 AM Hongyu Wang  wrote:
> >
> > Alderlake has an issue similar to PR 81616, so enabling
> > avoid_fma256_chains will also benefit the latest Intel platforms,
> > Alderlake and Sapphire Rapids.
> >
> > Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,}.
> >
> > Ok for master?
> >
> > gcc/ChangeLog:
> >
> > * config/i386/x86-tune.def (X86_TUNE_AVOID_256FMA_CHAINS): Add
> > m_SAPPHIRERAPIDS, m_ALDERLAKE and m_CORE_ATOM.
>
> OK.
>
> Thanks,
> Uros.
>
> > ---
> >  gcc/config/i386/x86-tune.def | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
> > index cd66f335113..db85de20bae 100644
> > --- a/gcc/config/i386/x86-tune.def
> > +++ b/gcc/config/i386/x86-tune.def
> > @@ -499,7 +499,8 @@ DEF_TUNE (X86_TUNE_AVOID_128FMA_CHAINS, "avoid_fma_chains", m_ZNVER)
> >
> >  /* X86_TUNE_AVOID_256FMA_CHAINS: Avoid creating loops with tight 256bit or
> > smaller FMA chain.  */
> > -DEF_TUNE (X86_TUNE_AVOID_256FMA_CHAINS, "avoid_fma256_chains", m_ZNVER2 | m_ZNVER3)
> > +DEF_TUNE (X86_TUNE_AVOID_256FMA_CHAINS, "avoid_fma256_chains", m_ZNVER2 | m_ZNVER3
> > + | m_ALDERLAKE | m_SAPPHIRERAPIDS | m_CORE_ATOM)
> >
> >  /* X86_TUNE_V2DF_REDUCTION_PREFER_PHADDPD: Prefer haddpd
> > for v2df vector reduction.  */
> > --
> > 2.18.1
> >
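
For readers not familiar with the term, here is a minimal sketch of the kind of loop the "FMA chain" tuning is about, as far as the option comment describes it. The function names are hypothetical and nothing here is taken from the patch. With a single accumulator, each multiply-add depends on the previous one, so if the compiler fuses it into an FMA the loop becomes one serial chain. The two-accumulator variant is a hand-written alternative shown only for contrast; it is not what the tuning flag makes GCC do.

```c
#include <stddef.h>

/* One accumulator: every multiply-add needs the previous result, so a
   fused version of this loop forms a single tight dependency chain.  */
double dot_chained(const double *a, const double *b, size_t n)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += a[i] * b[i];
    return acc;
}

/* Two accumulators: two shorter, independent chains.  Hand-written
   illustration only, not a transformation performed by the patch.  */
double dot_split(const double *a, const double *b, size_t n)
{
    double acc0 = 0.0, acc1 = 0.0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        acc0 += a[i] * b[i];
        acc1 += a[i + 1] * b[i + 1];
    }
    if (i < n)                      /* odd element left over */
        acc0 += a[i] * b[i];
    return acc0 + acc1;
}
```

Whether the chained form is actually slower depends on the microarchitecture, which is the point of making this a per-CPU tuning flag.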


Re: [PATCH] [x86] x86: Don't add crtfastmath.o for -shared and add a new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.

2022-12-14 Thread Richard Biener via Gcc-patches
On Wed, Dec 14, 2022 at 3:21 AM liuhongt via Gcc-patches
 wrote:
>
> Don't add crtfastmath.o for -shared to avoid changing the MXCSR
> register when loading a shared library.  crtfastmath.o will be used
> only when building executables.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?

You reject negative -mdaz-ftz but wouldn't that be useful with
-Ofast -mno-daz-ftz since there's otherwise no way to avoid that?

Richard.

> gcc/ChangeLog:
>
> PR target/55522
> PR target/36821
> * config/i386/gnu-user-common.h (GNU_USER_TARGET_MATHFILE_SPEC):
> Link crtfastmath.o when -mdaz-ftz is specified, not link it
> when -shared is specified.
> * config/i386/i386.opt (mdaz-ftz): New option.
> * doc/invoke.texi (x86 options): Document -mdaz-ftz.
> ---
>  gcc/config/i386/gnu-user-common.h |  2 +-
>  gcc/config/i386/i386.opt  |  4 
>  gcc/doc/invoke.texi   | 10 +-
>  3 files changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/i386/gnu-user-common.h b/gcc/config/i386/gnu-user-common.h
> index cab9be2bfb7..02e4a2192a4 100644
> --- a/gcc/config/i386/gnu-user-common.h
> +++ b/gcc/config/i386/gnu-user-common.h
> @@ -47,7 +47,7 @@ along with GCC; see the file COPYING3.  If not see
>
>  /* Similar to standard GNU userspace, but adding -ffast-math support.  */
>  #define GNU_USER_TARGET_MATHFILE_SPEC \
> -  "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s} \
> +  "%{Ofast|ffast-math|funsafe-math-optimizations|mdaz-ftz:%{!shared:crtfastmath.o%s}} \
> %{mpc32:crtprec32.o%s} \
> %{mpc64:crtprec64.o%s} \
> %{mpc80:crtprec80.o%s}"
> diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
> index fb4e57ada7c..8fd222db857 100644
> --- a/gcc/config/i386/i386.opt
> +++ b/gcc/config/i386/i386.opt
> @@ -420,6 +420,10 @@ mpc80
>  Target RejectNegative
>  Set 80387 floating-point precision to 80-bit.
>
> +mdaz-ftz
> +Target RejectNegative
> +Set the FTZ and DAZ Flags.
> +
>  mpreferred-stack-boundary=
>  Target RejectNegative Joined UInteger Var(ix86_preferred_stack_boundary_arg)
>  Attempt to keep stack aligned to this power of 2.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index cb40b38b73a..670e3767fbd 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -1433,7 +1433,7 @@ See RS/6000 and PowerPC Options.
>  -m96bit-long-double  -mlong-double-64  -mlong-double-80  -mlong-double-128 @gol
>  -mregparm=@var{num}  -msseregparm @gol
>  -mveclibabi=@var{type}  -mvect8-ret-in-mem @gol
> --mpc32  -mpc64  -mpc80  -mstackrealign @gol
> +-mpc32  -mpc64  -mpc80  -mdaz-ftz -mstackrealign @gol
>  -momit-leaf-frame-pointer  -mno-red-zone  -mno-tls-direct-seg-refs @gol
>  -mcmodel=@var{code-model}  -mabi=@var{name}  -maddress-mode=@var{mode} @gol
>  -m32  -m64  -mx32  -m16  -miamcu  -mlarge-data-threshold=@var{num} @gol
> @@ -32752,6 +32752,14 @@ are enabled by default; routines in such libraries could suffer significant
>  loss of accuracy, typically through so-called ``catastrophic cancellation'',
>  when this option is used to set the precision to less than extended precision.
>
> +@item -mdaz-ftz
> +@opindex mdaz-ftz
> +
> +the flush-to-zero (FTZ) and denormals-are-zero (DAZ) flags in the MXCSR register
> +are used to control floating-point calculations.SSE and AVX instructions
> +including scalar and vector instructions could benefit from enabling the FTZ
> +and DAZ flags when @option{-mdaz-ftz} is specified.

Maybe say that the MXCSR register is set at program start to achieve
this when the flag is specified at _link_ time and say this switch is
ignored when -shared is specified?

> +
>  @item -mstackrealign
>  @opindex mstackrealign
>  Realign the stack at entry.  On the x86, the @option{-mstackrealign}
> --
> 2.27.0
>
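
To make the effect under discussion concrete, here is a minimal sketch of what a start-up object such as crtfastmath.o has to do: read MXCSR, set the FTZ and DAZ bits, and write it back. This is an illustration under assumptions, not the libgcc source. The function name is hypothetical, it uses the `_mm_getcsr`/`_mm_setcsr` intrinsics from `<xmmintrin.h>`, it only builds for x86 targets with SSE enabled, and it ignores any CPU feature checks a real start-up object may need.

```c
#include <xmmintrin.h>

#define MXCSR_DAZ (1u << 6)    /* denormals-are-zero: bit 6 of MXCSR */
#define MXCSR_FTZ (1u << 15)   /* flush-to-zero: bit 15 of MXCSR */

/* Set FTZ and DAZ for the calling thread and return the new MXCSR
   value.  Hypothetical helper, for illustration only.  */
unsigned int enable_daz_ftz(void)
{
    unsigned int csr = _mm_getcsr();
    csr |= MXCSR_DAZ | MXCSR_FTZ;
    _mm_setcsr(csr);
    return _mm_getcsr();
}
```

Because the register is per-thread state, running code like this from a shared library's constructor changes the floating-point behaviour of whatever program loads the library, which is the motivation for not linking crtfastmath.o with -shared.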


[PATCH] RISC-V: Add testcases for VSETVL PASS

2022-12-14 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/rvv.exp: Adjust to enable tests for VSETVL PASS.
* gcc.target/riscv/rvv/vsetvl/dump-1.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-1.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-10.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-11.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-12.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-13.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-14.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-15.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-16.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-17.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-18.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-19.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-2.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-3.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-4.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-5.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-6.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-7.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-8.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-9.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-1.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-2.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-3.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-4.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-5.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-6.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-7.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-8.c: New test.

---
 gcc/testsuite/gcc.target/riscv/rvv/rvv.exp|   2 +
 .../gcc.target/riscv/rvv/vsetvl/dump-1.c  |  33 
 .../riscv/rvv/vsetvl/vlmax_single_block-1.c   | 154 ++
 .../riscv/rvv/vsetvl/vlmax_single_block-10.c  | 143 
 .../riscv/rvv/vsetvl/vlmax_single_block-11.c  |  34 
 .../riscv/rvv/vsetvl/vlmax_single_block-12.c  |  92 +++
 .../riscv/rvv/vsetvl/vlmax_single_block-13.c  |  89 ++
 .../riscv/rvv/vsetvl/vlmax_single_block-14.c  |  16 ++
 .../riscv/rvv/vsetvl/vlmax_single_block-15.c  |  42 +
 .../riscv/rvv/vsetvl/vlmax_single_block-16.c  | 147 +
 .../riscv/rvv/vsetvl/vlmax_single_block-17.c  |  32 
 .../riscv/rvv/vsetvl/vlmax_single_block-18.c  |  32 
 .../riscv/rvv/vsetvl/vlmax_single_block-19.c  | 105 
 .../riscv/rvv/vsetvl/vlmax_single_block-2.c   |  70 
 .../riscv/rvv/vsetvl/vlmax_single_block-3.c   |  70 
 .../riscv/rvv/vsetvl/vlmax_single_block-4.c   |  49 ++
 .../riscv/rvv/vsetvl/vlmax_single_block-5.c   |  49 ++
 .../riscv/rvv/vsetvl/vlmax_single_block-6.c   |  28 
 .../riscv/rvv/vsetvl/vlmax_single_block-7.c   |  28 
 .../riscv/rvv/vsetvl/vlmax_single_block-8.c   |  28 
 .../riscv/rvv/vsetvl/vlmax_single_block-9.c   | 147 +
 .../riscv/rvv/vsetvl/vlmax_single_vtype-1.c   |  86 ++
 .../riscv/rvv/vsetvl/vlmax_single_vtype-2.c   |  42 +
 .../riscv/rvv/vsetvl/vlmax_single_vtype-3.c   |  38 +
 .../riscv/rvv/vsetvl/vlmax_single_vtype-4.c   |  31 
 .../riscv/rvv/vsetvl/vlmax_single_vtype-5.c   |  31 
 .../riscv/rvv/vsetvl/vlmax_single_vtype-6.c   |  18 ++
 .../riscv/rvv/vsetvl/vlmax_single_vtype-7.c   |  18 ++
 .../riscv/rvv/vsetvl/vlmax_single_vtype-8.c   |  18 ++
 29 files changed, 1672 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/dump-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_single_block-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_single_block-10.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_single_block-11.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_single_block-12.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_single_block-13.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_single_block-14.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_single_block-15.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_single_block-16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_single_block-17.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_single_block-18.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_single_block-19.c
 create mode 100644 
gcc/

[PATCH] RISC-V: Add testcases for VSETVL PASS 2

2022-12-14 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/vlmax_phi-1.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-10.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-11.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-12.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-13.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-14.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-15.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-16.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-17.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-18.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-19.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-2.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-20.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-21.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-22.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-23.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-24.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-25.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-26.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-27.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-28.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-3.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-4.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-5.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-6.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-7.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-8.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-9.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-1.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-10.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-11.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-12.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-13.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-14.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-15.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-16.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-2.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-3.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-4.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-5.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-6.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-7.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-8.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-9.c: New test.

---
 .../gcc.target/riscv/rvv/vsetvl/vlmax_phi-1.c |  37 +++
 .../riscv/rvv/vsetvl/vlmax_phi-10.c   |  37 +++
 .../riscv/rvv/vsetvl/vlmax_phi-11.c   |  37 +++
 .../riscv/rvv/vsetvl/vlmax_phi-12.c   |  37 +++
 .../riscv/rvv/vsetvl/vlmax_phi-13.c   |  37 +++
 .../riscv/rvv/vsetvl/vlmax_phi-14.c   | 217 
 .../riscv/rvv/vsetvl/vlmax_phi-15.c   |  40 +++
 .../riscv/rvv/vsetvl/vlmax_phi-16.c   |  40 +++
 .../riscv/rvv/vsetvl/vlmax_phi-17.c   |  40 +++
 .../riscv/rvv/vsetvl/vlmax_phi-18.c   |  40 +++
 .../riscv/rvv/vsetvl/vlmax_phi-19.c   |  40 +++
 .../gcc.target/riscv/rvv/vsetvl/vlmax_phi-2.c |  37 +++
 .../riscv/rvv/vsetvl/vlmax_phi-20.c   |  40 +++
 .../riscv/rvv/vsetvl/vlmax_phi-21.c   |  40 +++
 .../riscv/rvv/vsetvl/vlmax_phi-22.c   |  40 +++
 .../riscv/rvv/vsetvl/vlmax_phi-23.c   |  40 +++
 .../riscv/rvv/vsetvl/vlmax_phi-24.c   |  40 +++
 .../riscv/rvv/vsetvl/vlmax_phi-25.c   |  40 +++
 .../riscv/rvv/vsetvl/vlmax_phi-26.c   |  40 +++
 .../riscv/rvv/vsetvl/vlmax_phi-27.c   |  40 +++
 .../riscv/rvv/vsetvl/vlmax_phi-28.c   | 237 ++
 .../gcc.target/riscv/rvv/vsetvl/vlmax_phi-3.c |  37 +++
 .../gcc.target/riscv/rvv/vsetvl/vlmax_phi-4.c |  37 +++
 .../gcc.target/riscv/rvv/vsetvl/vlmax_phi-5.c |  37 +++
 .../gcc.target/riscv/rvv/vsetvl/vlmax_phi-6.c |  37 +++
 .../gcc.target/riscv/rvv/vsetvl/vlmax_phi-7.c |  37 +++
 .../gcc.target/riscv/rvv/vsetvl/vlmax_phi-8.c |  37 +++
 .../gcc.target/riscv/rvv/vsetvl/vlmax_phi-9.c |  37 +++
 .../riscv/rvv/vsetvl/vlmax_switch_vtype-1.c   |  26 ++
 .../riscv/rvv/vsetvl/vlmax_switch_vtype-10.c  |  47 
 .../riscv/rvv/vsetvl/vlmax_switch_vtype-11.c  |  55 
 .../riscv/rvv/vsetvl/vlmax_switch_vtype-12.c  |  55 
 .../riscv/rvv/vsetvl/vlmax_switch_vtype-13.c  |  17 ++
 .../riscv/rvv/vsetvl/vlmax_switch_vtype-14.c  |  39 +++
 .../riscv/rvv/vsetvl/vlmax_switch_vty

[PATCH] RISC-V: Add testcases for VSETVL PASS 3

2022-12-14 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-1.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-10.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-11.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-12.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-13.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-14.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-15.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-16.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-17.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-18.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-19.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-2.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-20.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-21.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-22.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-23.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-24.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-25.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-26.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-27.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-28.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-3.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-4.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-5.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-6.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-7.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-8.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-9.c: New test.

---
 .../riscv/rvv/vsetvl/vlmax_miss_default-1.c   |  32 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-10.c  |  32 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-11.c  |  32 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-12.c  |  32 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-13.c  |  32 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-14.c  | 189 ++
 .../riscv/rvv/vsetvl/vlmax_miss_default-15.c  |  38 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-16.c  |  38 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-17.c  |  38 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-18.c  |  38 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-19.c  |  38 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-2.c   |  32 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-20.c  |  38 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-21.c  |  38 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-22.c  |  38 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-23.c  |  38 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-24.c  |  38 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-25.c  |  38 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-26.c  |  38 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-27.c  |  38 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-28.c  | 231 ++
 .../riscv/rvv/vsetvl/vlmax_miss_default-3.c   |  32 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-4.c   |  32 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-5.c   |  32 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-6.c   |  32 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-7.c   |  32 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-8.c   |  32 +++
 .../riscv/rvv/vsetvl/vlmax_miss_default-9.c   |  32 +++
 28 files changed, 1330 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-1.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-10.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-11.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-12.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-13.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-14.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-15.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-17.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-18.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-19.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-2.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-20.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-21.c
 create mode 100644 
gcc/testsuite/g

Re: [PATCH] [x86] x86: Don't add crtfastmath.o for -shared and add a new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.

2022-12-14 Thread Jakub Jelinek via Gcc-patches
On Wed, Dec 14, 2022 at 09:08:02AM +0100, Richard Biener via Gcc-patches wrote:
> On Wed, Dec 14, 2022 at 3:21 AM liuhongt via Gcc-patches
>  wrote:
> >
> > Don't add crtfastmath.o for -shared to avoid changing the MXCSR
> > register when loading a shared library.  crtfastmath.o will be used
> > only when building executables.
> >
> > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > Ok for trunk?
> 
> You reject negative -mdaz-ftz but wouldn't that be useful with
> -Ofast -mno-daz-ftz since there's otherwise no way to avoid that?

Agreed.
I even wonder if the best wouldn't be to make the option effectively
three state, default, no and yes, where if the option isn't specified
at all, then crtfastmath.o* is linked as is now except for -shared,
if it is -mno-daz-ftz, then it is never linked in regardless of other
options and if it is -mdaz-ftz, then it is linked even for -shared.
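
The three-state behaviour proposed above can be sketched as a small decision function. This is only a model of the proposal as worded in this mail, not driver code; the enum, the function name and the `fast_math` parameter (standing for any of -Ofast, -ffast-math or -funsafe-math-optimizations) are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical tri-state for the option.  */
enum daz_ftz_mode {
    DAZ_FTZ_DEFAULT,   /* option not given at all */
    DAZ_FTZ_NO,        /* -mno-daz-ftz */
    DAZ_FTZ_YES        /* -mdaz-ftz */
};

/* Return true if crtfastmath.o would be linked under the proposal.  */
bool links_crtfastmath(enum daz_ftz_mode mode, bool fast_math, bool shared)
{
    switch (mode) {
    case DAZ_FTZ_NO:
        return false;                 /* never linked */
    case DAZ_FTZ_YES:
        return true;                  /* linked even for -shared */
    case DAZ_FTZ_DEFAULT:
    default:
        return fast_math && !shared;  /* as now, except for -shared */
    }
}
```

For example, `links_crtfastmath(DAZ_FTZ_NO, true, false)` is false, which is the -Ofast -mno-daz-ftz case raised earlier in the thread.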

> > --- a/gcc/config/i386/i386.opt
> > +++ b/gcc/config/i386/i386.opt
> > @@ -420,6 +420,10 @@ mpc80
> >  Target RejectNegative
> >  Set 80387 floating-point precision to 80-bit.
> >
> > +mdaz-ftz
> > +Target RejectNegative
> > +Set the FTZ and DAZ Flags.

As the option is only used in the driver, shouldn't it be marked Driver
and not Target?  It doesn't need to be saved/restored on every cfun switch
etc.

> > +@item -mdaz-ftz
> > +@opindex mdaz-ftz
> > +
> > +the flush-to-zero (FTZ) and denormals-are-zero (DAZ) flags in the MXCSR register

Shouldn't description start with capital letter?

> > +are used to control floating-point calculations.SSE and AVX instructions
> > +including scalar and vector instructions could benefit from enabling the FTZ
> > +and DAZ flags when @option{-mdaz-ftz} is specified.
> 
> > Maybe say that the MXCSR register is set at program start to achieve
> > this when the flag is specified at _link_ time and say this switch is
> > ignored when -shared is specified?

Jakub



Re: [Patch, Fortran] libgfortran's ISO_Fortran_binding.c: Use GCC11 version for backward-only code [PR108056]

2022-12-14 Thread Richard Biener via Gcc-patches
On Tue, Dec 13, 2022 at 5:29 PM Tobias Burnus  wrote:
>
> This is a 12/13 regression, as some changes to fix the GFC/CFI descriptor
> that went into GCC 12 fail with the (bogus) descriptor passed by a
> GCC-11-compiled program.
>
> As later GCC 12 changes moved the descriptor to the front end, those
> functions are only in libgomp.so to cater for old programs. Richard
> suggested in the PR that the best way is to move to the GCC 11 version,
> such that libgfortran.so won't regress.
>
> I now did so - except for three fixes (cf. changelog). See also
> PR: https://gcc.gnu.org/PR108056
>
> There is no testcase as it needs to be compiled by GCC <= 11 and then
> run with linking (dynamically) to a GCC 12 or 13 libgfortran.
>
> OK for mainline and GCC 12?

OK for the branch if approved for trunk.

Richard.

>   * * *
>
> Note: It is strongly recommended to use GCC 12 (or 13) with array-descriptor
> C interop, as many issues were fixed. For instance, for the testcase in the PR,
> with GCC 11 the type arriving in libgomp is BT_ASSUME ('type(*)'). But as the
> effective argument is passed as an array descriptor throughout, the 'float'
> (real(4)) type info is actually preservable (as GCC 12 does; cf. the testcase
> of comment 0 and my comment in the PR for the C part of the testcase).(*)
>
> Tobias
>
> ((*) This is not possible if using a scalar 'type(*)' or a non-array-descriptor
> array in between. I think GCC 12 uses 'CFI_other' in the information-is-lost case.)


Re: [PATCH] [x86] x86: Don't add crtfastmath.o for -shared and add a new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.

2022-12-14 Thread Richard Biener via Gcc-patches
On Wed, Dec 14, 2022 at 9:16 AM Jakub Jelinek  wrote:
>
> On Wed, Dec 14, 2022 at 09:08:02AM +0100, Richard Biener via Gcc-patches wrote:
> > On Wed, Dec 14, 2022 at 3:21 AM liuhongt via Gcc-patches
> >  wrote:
> > >
> > > Don't add crtfastmath.o for -shared to avoid changing the MXCSR
> > > register when loading a shared library.  crtfastmath.o will be used
> > > only when building executables.
> > >
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > > Ok for trunk?
> >
> > You reject negative -mdaz-ftz but wouldn't that be useful with
> > -Ofast -mno-daz-ftz since there's otherwise no way to avoid that?
>
> Agreed.
> I even wonder if the best wouldn't be to make the option effectively
> three state, default, no and yes, where if the option isn't specified
> at all, then crtfastmath.o* is linked as is now except for -shared,
> if it is -mno-daz-ftz, then it is never linked in regardless of other
> options and if it is -mdaz-ftz, then it is linked even for -shared.

Possibly.  I'd also suggest to split the changed -shared handling to
a separate patch since people may want to backport this and it
should be applicable to all other targets with similar handling.

> > > --- a/gcc/config/i386/i386.opt
> > > +++ b/gcc/config/i386/i386.opt
> > > @@ -420,6 +420,10 @@ mpc80
> > >  Target RejectNegative
> > >  Set 80387 floating-point precision to 80-bit.
> > >
> > > +mdaz-ftz
> > > +Target RejectNegative
> > > +Set the FTZ and DAZ Flags.
>
> As the option is only used in the driver, shouldn't it be marked Driver
> and not Target?  It doesn't need to be saved/restored on every cfun switch
> etc.
>
> > > +@item -mdaz-ftz
> > > +@opindex mdaz-ftz
> > > +
> > > > +the flush-to-zero (FTZ) and denormals-are-zero (DAZ) flags in the MXCSR register
>
> Shouldn't description start with capital letter?
>
> > > > +are used to control floating-point calculations.SSE and AVX instructions
> > > > +including scalar and vector instructions could benefit from enabling the FTZ
> > > > +and DAZ flags when @option{-mdaz-ftz} is specified.
> >
> > > Maybe say that the MXCSR register is set at program start to achieve
> > > this when the flag is specified at _link_ time and say this switch is
> > > ignored when -shared is specified?
>
> Jakub
>


[PATCH] RISC-V: Add testcases for VSETVL PASS 5

2022-12-14 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-1.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-10.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-11.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-12.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-13.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-14.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-15.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-16.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-17.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-18.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-19.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-2.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-20.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-21.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-22.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-23.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-24.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-25.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-26.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-27.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-28.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-29.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-3.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-30.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-31.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-32.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-33.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-34.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-35.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-36.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-37.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-38.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-39.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-4.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-40.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-41.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-42.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-43.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-44.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-45.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-46.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-5.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-6.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-7.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-8.c: New test.
* gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-9.c: New test.

---
 .../riscv/rvv/vsetvl/vlmax_back_prop-1.c  |  36 
 .../riscv/rvv/vsetvl/vlmax_back_prop-10.c |  59 +++
 .../riscv/rvv/vsetvl/vlmax_back_prop-11.c |  63 +++
 .../riscv/rvv/vsetvl/vlmax_back_prop-12.c |  64 
 .../riscv/rvv/vsetvl/vlmax_back_prop-13.c |  64 
 .../riscv/rvv/vsetvl/vlmax_back_prop-14.c |  58 +++
 .../riscv/rvv/vsetvl/vlmax_back_prop-15.c | 143 
 .../riscv/rvv/vsetvl/vlmax_back_prop-16.c |  54 ++
 .../riscv/rvv/vsetvl/vlmax_back_prop-17.c |  59 +++
 .../riscv/rvv/vsetvl/vlmax_back_prop-18.c |  58 +++
 .../riscv/rvv/vsetvl/vlmax_back_prop-19.c |  48 ++
 .../riscv/rvv/vsetvl/vlmax_back_prop-2.c  |  50 ++
 .../riscv/rvv/vsetvl/vlmax_back_prop-20.c |  59 +++
 .../riscv/rvv/vsetvl/vlmax_back_prop-21.c |  50 ++
 .../riscv/rvv/vsetvl/vlmax_back_prop-22.c |  58 +++
 .../riscv/rvv/vsetvl/vlmax_back_prop-23.c |  41 +
 .../riscv/rvv/vsetvl/vlmax_back_prop-24.c |  41 +
 .../riscv/rvv/vsetvl/vlmax_back_prop-25.c |  96 +++
 .../riscv/rvv/vsetvl/vlmax_back_prop-26.c |  89 ++
 .../riscv/rvv/vsetvl/vlmax_back_prop-27.c |  51 ++
 .../riscv/rvv/vsetvl/vlmax_back_prop-28.c |  54 ++
 .../riscv/rvv/vsetvl/vlmax_back_prop-29.c |  54 ++
 .../riscv/rvv/vsetvl/vlmax_back_prop-3.c  |  47 ++
 .../riscv/rvv/vsetvl/vlmax_back_prop-30.c |  44 +
 .../riscv/rvv/vsetvl/vlmax_back_prop-31.c |  46 ++
 .../riscv/rvv/vsetvl/vlmax_back_prop-32.c |  46 ++
 .../riscv/rvv/vsetvl/vlmax_back_prop-33.c |  45 +
 .../riscv/rvv/vsetvl/vlmax_back_prop-34.c |  45 +
 .../riscv/rvv/vsetvl/vlmax_back_

Re: [PATCH V6] rs6000: Optimize cmp on rotated 16bits constant

2022-12-14 Thread Jiufu Guo via Gcc-patches
Hi,

Segher Boessenkool  writes:

> Hi!
>
> Sorry for the tardiness.
>
> On Mon, Aug 29, 2022 at 11:42:16AM +0800, Jiufu Guo wrote:
>> When checking eq/ne with a constant which has only 16bits, it can be
>> optimized to check the rotated data.  By this, the constant building
>> is optimized.
>> 
>> As the example in PR103743:
>> For "in == 0x8000000000000000LL", this patch generates:
>> rotldi %r3,%r3,16
>> cmpldi %cr0,%r3,32768
>> instead:
>> li %r9,-1
>> rldicr %r9,%r9,0,0
>> cmpd %cr0,%r3,%r9
>
> FWIW, I find the winnt assembler syntax very hard to read, and I doubt
> I am the only one.
Oh, sorry about that.  I will avoid adding '-mregnames' when dumping asm. :)
BTW, which options do you use to dump asm code?
>
> So you're doing
>   rotldi 3,3,16 ; cmpldi 3,0x8000
> instead of
>   li 9,-1 ; rldicr 9,9,0,0 ; cmpd 3,9
>
>> +/* Check if C can be rotated from an immediate which starts (as 64bit integer)
>> +   with at least CLZ bits zero.
>> +
>> +   Return the number by which C can be rotated from the immediate.
>> +   Return -1 if C can not be rotated as from.  */
>> +
>> +int
>> +rotate_from_leading_zeros_const (unsigned HOST_WIDE_INT c, int clz)
>
> The name does not say what the function does.  Can you think of a better
> name?
>
> Maybe it is better to not return magic values anyway?  So perhaps
>
> bool
> can_be_done_as_compare_of_rotate (unsigned HOST_WIDE_INT c, int clz, int *rot)
>
> (with *rot written if the return value is true).
Thanks for your suggestion!
It checks whether a constant can be rotated from/to a value which has
only a few trailing nonzero bits (all leading bits are zeros).

So, I'm thinking to name the function as something like:
can_be_rotated_to_lowbits.
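
A minimal sketch of the check described above, under stated assumptions: it is not the code from the patch, the names rotl64 and can_be_rotated_to_lowbits_sketch are hypothetical, and it simply tries all 64 left rotations instead of doing the case analysis the patch uses.  It only illustrates the bool-plus-out-parameter interface suggested in the review.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not from the patch: rotate a 64-bit value left by
// r bits (0 <= r < 64) while avoiding an undefined shift by 64.
uint64_t rotl64 (uint64_t v, int r)
{
  return r == 0 ? v : (v << r) | (v >> (64 - r));
}

// Hypothetical sketch: return true and store the left-rotation count in
// *rot if rotating C left by *rot bits gives a value whose top CLZ bits
// are all zero.  *rot is left untouched when false is returned.
bool can_be_rotated_to_lowbits_sketch (uint64_t c, int clz, int *rot)
{
  assert (clz > 0 && clz < 64);
  const uint64_t high_mask = ~UINT64_C (0) << (64 - clz);
  for (int r = 0; r < 64; r++)
    if ((rotl64 (c, r) & high_mask) == 0)
      {
        *rot = r;
        return true;
      }
  return false;
}
```

With clz = 48 (the rotated value must fit in 16 bits), the constant 0x8000000000000000 is accepted with a rotation count of 1, which corresponds to the "rotldi 3,3,1 ; cmpldi 0,3,1" sequence.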

>
>> +  /* case c. xx10.0xx: rotate 'clz + 1' bits firstly, then check case b.
>
> s/firstly/first/
Thanks! 
>
>> +/* Check if C can be rotated from an immediate operand of cmpdi or cmpldi.  */
>> +
>> +bool
>> +compare_rotate_immediate_p (unsigned HOST_WIDE_INT c)
>
> No _p please, this function is not a predicate (at least, the name does
> not say what it tests).  So a better name please.  This matters even
> more for extern functions (like this one) because the function
> implementation is always farther away so you do not easily have all
> interface details in mind.  Good names help :-)
Thanks! Naming always matters. :)

Maybe we can name this function "can_be_rotated_as_compare_operand",
or "is_constant_rotateable_for_compare", because this function checks
"if a constant can be rotated to/from an immediate operand of
cmpdi/cmpldi".

>
>> +(define_code_iterator eqne [eq ne])
>> +(define_code_attr EQNE [(eq "EQ") (ne "NE")])
>
> Just <CODE> or <code> should work?
Great! Thanks for pointing this out!  <CODE> works.
>
> Please fix these things.  Almost there :-)

I updated the patch as below. Bootstrapping and regtesting are ongoing.
Thanks again for your careful and insightful review!


BR,
Jeff (Jiufu)

--
When checking eq/ne with a constant which has only 16bits, it can be
optimized to check the rotated data.  By this, the constant building
is optimized.

As the example in PR103743:
For "in == 0x8000000000000000LL", this patch generates:
rotldi 3,3,1 ; cmpldi 0,3,1
instead of:
li 9,-1 ; rldicr 9,9,0,0 ; cmpd 0,3,9

Compare with previous version:
This patch refactors the code according to review comments.
e.g. updating function names/comments/code.


PR target/103743

gcc/ChangeLog:

* config/rs6000/rs6000-protos.h (can_be_rotated_to_lowbits): New.
(can_be_rotated_as_compare_operand): New.
* config/rs6000/rs6000.cc (can_be_rotated_to_lowbits): New definition.
(can_be_rotated_as_compare_operand): New definition.
* config/rs6000/rs6000.md (*rotate_on_cmpdi): New define_insn_and_split.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr103743.c: New test.
* gcc.target/powerpc/pr103743_1.c: New test.

---
 gcc/config/rs6000/rs6000-protos.h |  2 +
 gcc/config/rs6000/rs6000.cc   | 56 +++
 gcc/config/rs6000/rs6000.md   | 63 +++-
 gcc/testsuite/gcc.target/powerpc/pr103743.c   | 52 ++
 gcc/testsuite/gcc.target/powerpc/pr103743_1.c | 95 +++
 5 files changed, 267 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743_1.c

diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index d0d89320ef6..9626917e359 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -35,6 +35,8 @@ extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *);
 extern int vspltis_shifted (rtx);
 extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
 extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
+extern bool can_be_rotated_to_lowbits (unsigned HOST_WIDE_INT, int, int *);
+ext

RE: [PATCH] [x86] x86: Don't add crtfastmath.o for -shared and add a new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.

2022-12-14 Thread Liu, Hongtao via Gcc-patches


> -Original Message-
> From: Richard Biener 
> Sent: Wednesday, December 14, 2022 4:23 PM
> To: Jakub Jelinek 
> Cc: Liu, Hongtao ; gcc-patches@gcc.gnu.org;
> crazy...@gmail.com; hjl.to...@gmail.com; ubiz...@gmail.com
> Subject: Re: [PATCH] [x86] x86: Don't add crtfastmath.o for -shared and add a
> new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.
> 
> On Wed, Dec 14, 2022 at 9:16 AM Jakub Jelinek  wrote:
> >
> > On Wed, Dec 14, 2022 at 09:08:02AM +0100, Richard Biener via Gcc-patches
> wrote:
> > > On Wed, Dec 14, 2022 at 3:21 AM liuhongt via Gcc-patches
> > >  wrote:
> > > >
> > > > Don't add crtfastmath.o for -shared to avoid changing the MXCSR
> > > > register when loading a shared library.  crtfastmath.o will be
> > > > used only when building executables.
> > > >
> > > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > > > Ok for trunk?
> > >
> > > You reject negative -mdaz-ftz but wouldn't that be useful with
> > > -Ofast -mno-daz-ftz since there's otherwise no way to avoid that?
> >
> > Agreed.
> > I even wonder if the best wouldn't be to make the option effectively
> > three state, default, no and yes, where if the option isn't specified
> > at all, then crtfastmath.o* is linked as is now except for -shared, if
> > it is -mno-daz-ftz, then it is never linked in regardless of other
> > options and if it is -mdaz-ftz, then it is linked even for -shared.
> 
> Possibly.  I'd also suggest to split the changed -shared handling to a 
> separate
> patch since people may want to backport this and it should be applicable to
> all other targets with similar handling.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55522#c26
So patch in the upper link is ok for trunk?
I'll change -mdaz-ftz part as a separate patch.
> 
> > > > --- a/gcc/config/i386/i386.opt
> > > > +++ b/gcc/config/i386/i386.opt
> > > > @@ -420,6 +420,10 @@ mpc80
> > > >  Target RejectNegative
> > > >  Set 80387 floating-point precision to 80-bit.
> > > >
> > > > +mdaz-ftz
> > > > +Target RejectNegative
> > > > +Set the FTZ and DAZ Flags.
> >
> > As the option is only used in the driver, shouldn't it be marked
> > Driver and not Target?  It doesn't need to be saved/restored on every
> > cfun switch etc.
> >
> > > > +@item -mdaz-ftz
> > > > +@opindex mdaz-ftz
> > > > +
> > > > +the flush-to-zero (FTZ) and denormals-are-zero (DAZ) flags in the
> > > > +MXCSR register
> >
> > Shouldn't description start with capital letter?
> >
> > > > +are used to control floating-point calculations.SSE and AVX
> > > > +instructions including scalar and vector instructions could
> > > > +benefit from enabling the FTZ and DAZ flags when @option{-mdaz-ftz}
> is specified.
> > >
> > > Maybe say that the MXCSR register is set at program start to achieve
> > > this when the flag is specified at _link_ time and say this switch
> > > is ignored when -shared is specified?
> >
> > Jakub
> >


Re: [PATCH v5 1/19] modula2 front end: Fixes, improvements detecting python3 and documentation generation (shorter).

2022-12-14 Thread Gaius Mulley via Gcc-patches
Richard Biener  writes:

> On Wed, Dec 14, 2022 at 8:48 AM Gaius Mulley  wrote:
>>
>>
>>
>> This patch set adds a re-exp ACX_CHECK_PROG_VER to detect python3.
>> HAVE_PYTHON is then checked in gcc/m2/Make-lang.in to generate library
>> chapters if python3 is available.  If python3 is unavailable then the
>> chapters are copied from a target-independent version.
>>
>> Bugfixed --enable-generated-files-in-srcdir.
>>
>> Python3 modules section added to install.texi.
>>
>> Also included are the target-independent versions of the
>> documentation.  The only difference is in the SYSTEM module which if
>> generated when HAVE_PYTHON is "yes" will enumerate all fundamental
>> data types supported by the target and compiler.
>>
>> I've hand snipped to try and reduce the size/noise as some of
>> these files have already been reviewed.
>>
>> I'll post the unedited version as well for completeness,
>
> LGTM.

thanks - this is the last patch tick.  So I'll actually do the merge now :-)

> I'll note that in other GCC manuals we have target dependent
> things documented for each supported target.  It looks like
> the M2 docs will have only documentation built for the target
> the compiler is built for?  So for example the online documentation
> hosted on gcc.gnu.org will then contain only documentation for
> the x86_64-linux target specific SYSTEM module?

true

> building documentation for openSUSE (caveat: we only build
> .info docs and manpages) the documentation is only built once
> (aka for rpm 'noarch') with the idea it will be the same on all targets.
>
> So it seems to me that enumerating the SYSTEM module documentation
> for all targets or somehow differently organizing it would be better?

yes it would.  There is a desire to move the non standard data types out
of SYSTEM (by the user community) for the future.  But having a comment
saying that data type availability depends upon the target architecture
would suffice for now, I'll add a comment,

regards,
Gaius


[PATCH] RISC-V: Fix annotation

2022-12-14 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc: Fix annotation.

---
 gcc/config/riscv/riscv-vsetvl.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index c602426b542..3ca3fc15e5a 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -35,7 +35,7 @@ along with GCC; see the file COPYING3.  If not see
 
 -  Each avl operand is either an immediate (must be in range 0 ~ 31) or reg.
 
-This pass consists of 3 phases:
+This pass consists of 5 phases:
 
 -  Phase 1 - compute VL/VTYPE demanded information within each block
by backward data-flow analysis.
-- 
2.36.3



Re: [PATCH 1/3] Rework 128-bit complex multiply and divide, PR target/107299

2022-12-14 Thread Kewen.Lin via Gcc-patches
on 2022/12/13 14:14, Michael Meissner wrote:
> On Mon, Dec 12, 2022 at 06:20:14PM +0800, Kewen.Lin wrote:
>> Without or with patch #1, the below ICE in libgcc exists, the ICE should have
>> nothing to do with the special handling for building_libgcc in patch #1.  I
>> think patch #2 which makes _Float128 and __float128 use the same internal
>> type fixes that ICE.
>>
>> I still don't get the point why we need the special handling for 
>> building_libgcc,
>> I also tested on top of patch #1 and #2 w/ and w/o the special handling for
>> building_libgcc, both bootstrapped and regress-tested.
>>
>> Could you have a double check?
> 
> As long as patch #2 and #3 are installed, we don't need the special handling
> for building_libgcc.  Good catch.
> 
> I will send out a replacement patch for it.

Thanks!  I still feel patch #1 is independent, it helps to fix the issues as
shown in its associated test case, which looks like an oversight in the
previous implementation to me. :)

> 
>> Since your patch #2 (and #3) fixes ICE and some exposed problems, and 
>> _Float128
>> is to use the same internal type as __float128, types with 
>> attribute((mode(TF)))
>> and attribute((mode(TC))) should be correct, I assume that this patch is just
>> to make the types explicit be with _Float128 (for better readability and
>> maintenance), but not for any correctness issues.
> 
> Yes, the patch is mainly for clarity.  The history is the libgcc support went
> in before _Float128 went in, and I never went back to use those types when we
> could use them.
> 
> With _Float128, we can just use _Complex _Float128 and not
> bother with trying to get the right KC/TC for the attribute mode stuff.
> 
> However, if patches 1-3 aren't put in, just applying the patch to use 
> _Float128
> and _Complex _Float128 would fix the immediate problem (of not building GCC on
> systems with IEEE 128-bit long double).  However, it is a band-aid that just
> works around the problem of building __mulkc3 and __divkc3.  It doesn't fix 
> the
> other problems between __float128 and _Float128 that show up in some places
> that I would like to get fixed.
> 
> So I haven't submitted the patch, because I think it is more important to get
> the other issues fixed.

OK, makes sense, thanks for the clarification!

BR,
Kewen


Re: [PATCH 2/3] Make __float128 use the _Float128 type, PR target/107299

2022-12-14 Thread Kewen.Lin via Gcc-patches
on 2022/12/6 19:27, Kewen.Lin via Gcc-patches wrote:
> Hi Mike,
> 
> Thanks for fixing this, some comments are inlined below.
> 
> on 2022/11/2 10:42, Michael Meissner wrote:
>> This patch fixes the issue that GCC cannot build when the default long double
>> is IEEE 128-bit.  It fails in building libgcc, specifically when it is trying
>> to build the __mulkc3 function in libgcc.  It is failing in 
>> gimple-range-fold.cc
>> during the evrp pass.  Ultimately it is failing because the code declared the
>> type to use TFmode but it used F128 functions (i.e. KFmode).

Looking into this further, I found that although __float128 and _Float128 are
two different types, they have the same mode TFmode; the unexpected thing is
that these two types have different precision.  I noticed it's due to the
"workaround" in build_common_tree_nodes:

  /* Work around the rs6000 KFmode having precision 113 not
 128.  */
  const struct real_format *fmt = REAL_MODE_FORMAT (mode);
  gcc_assert (fmt->b == 2 && fmt->emin + fmt->emax == 3);
  int min_precision = fmt->p + ceil_log2 (fmt->emax - fmt->emin);
  if (!extended)
gcc_assert (min_precision == n);
  if (precision < min_precision)
precision = min_precision;

Since useless_type_conversion_p considers two float types compatible if they
have the same mode, it doesn't require explicit conversions between these two
types.  I think that's exactly what we want.  And to me, it looks unexpected
to have two types with the same mode but different precision.
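
For reference, the arithmetic in the workaround quoted above can be sketched as follows.  This is a standalone illustration, not GCC code: ceil_log2_sketch and min_precision_sketch are hypothetical names, and the IEEE binary128 parameters used in the usage note (p = 113, emin = -16381, emax = 16384, in the real_format convention where emin + emax == 3) are an assumption that is not quoted in this thread.

```cpp
#include <cassert>

// Smallest k such that 2^k >= x, for x >= 1 (stand-in for GCC's ceil_log2).
int ceil_log2_sketch (unsigned long x)
{
  int k = 0;
  while ((1UL << k) < x)
    k++;
  return k;
}

// Mirrors the arithmetic of the workaround: P significand bits plus the
// number of bits needed to cover the exponent range EMIN..EMAX.
int min_precision_sketch (int p, int emin, int emax)
{
  assert (emin + emax == 3);   // same sanity check as the gcc_assert
  return p + ceil_log2_sketch ((unsigned long) (emax - emin));
}
```

Under those assumed parameters, min_precision_sketch (113, -16381, 16384) evaluates to 113 + 15 = 128, so a mode precision of 113 is raised to 128 for the _Float128 type node.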

So could we consider disabling the above workaround to make _Float128 have the
same precision as __float128 (long double) (the underlying TFmode)?  I tried the
below change:

diff --git a/gcc/tree.cc b/gcc/tree.cc
index 254b2373dcf..10fcb3d88ca 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -9442,6 +9442,7 @@ build_common_tree_nodes (bool signed_char)
   if (!targetm.floatn_mode (n, extended).exists (&mode))
 continue;
   int precision = GET_MODE_PRECISION (mode);
+#if 0
   /* Work around the rs6000 KFmode having precision 113 not
  128.  */
   const struct real_format *fmt = REAL_MODE_FORMAT (mode);
@@ -9451,6 +9452,7 @@ build_common_tree_nodes (bool signed_char)
 gcc_assert (min_precision == n);
   if (precision < min_precision)
 precision = min_precision;
+#endif
   FLOATN_NX_TYPE_NODE (i) = make_node (REAL_TYPE);
   TYPE_PRECISION (FLOATN_NX_TYPE_NODE (i)) = precision;
   layout_type (FLOATN_NX_TYPE_NODE (i));

It can be bootstrapped (fixing the ICE in PR107299).  Comparing with the
baseline regression test results with patch #1, #2 and #3, I got some positive:

FAIL->PASS: 17_intro/headers/c++2020/all_attributes.cc (test for excess errors)
FAIL->PASS: 17_intro/headers/c++2020/all_no_exceptions.cc (test for excess errors)
FAIL->PASS: 17_intro/headers/c++2020/all_no_rtti.cc (test for excess errors)
FAIL->PASS: 17_intro/headers/c++2020/all_pedantic_errors.cc (test for excess errors)
FAIL->PASS: 17_intro/headers/c++2020/operator_names.cc (test for excess errors)
FAIL->PASS: 17_intro/headers/c++2020/stdc++.cc (test for excess errors)
FAIL->PASS: 17_intro/headers/c++2020/stdc++_multiple_inclusion.cc (test for excess errors)
FAIL->PASS: std/format/arguments/args.cc (test for excess errors)
FAIL->PASS: std/format/error.cc (test for excess errors)
FAIL->PASS: std/format/formatter/requirements.cc (test for excess errors)
FAIL->PASS: std/format/functions/format.cc (test for excess errors)
FAIL->PASS: std/format/functions/format_to_n.cc (test for excess errors)
FAIL->PASS: std/format/functions/size.cc (test for excess errors)
FAIL->PASS: std/format/functions/vformat_to.cc (test for excess errors)
FAIL->PASS: std/format/parse_ctx.cc (test for excess errors)
FAIL->PASS: std/format/string.cc (test for excess errors)
FAIL->PASS: std/format/string_neg.cc (test for excess errors)
FAIL->PASS: g++.dg/cpp23/ext-floating1.C  -std=gnu++23 (test for excess errors)

and some negative:

PASS->FAIL: gcc.dg/torture/float128-nan.c   -O0  execution test
PASS->FAIL: gcc.dg/torture/float128-nan.c   -O1  execution test
PASS->FAIL: gcc.dg/torture/float128-nan.c   -O2  execution test
PASS->FAIL: gcc.dg/torture/float128-nan.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  execution test
PASS->FAIL: gcc.dg/torture/float128-nan.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  execution test
PASS->FAIL: gcc.dg/torture/float128-nan.c   -O3 -g  execution test
PASS->FAIL: gcc.dg/torture/float128-nan.c   -Os  execution test
PASS->FAIL: gcc.target/powerpc/nan128-1.c execution test

The negative part is about nan, I haven't looked into it, but I think it may be 
the
reason why we need the workaround there, CC Joseph.  Anyway it needs more 
investigation
here, but IMHO the required information (ie. the actual precision) can be 
retrieved
from REAL_MODE_FORMAT(mode) of TYPE_MODE, so it should be doable to fix some 
othe

[PATCH] RISC-V: Remove unused redundant vector attributes

2022-12-14 Thread juzhe . zhong
From: Ju-Zhe Zhong 

I found that I forgot to remove these redundant attributes.
Sorry about that.

gcc/ChangeLog:

* config/riscv/vector.md (): Remove redundant attributes.

---
 gcc/config/riscv/vector.md | 20 
 1 file changed, 20 deletions(-)

diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 3bfda652318..7dfadaa96b6 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -219,26 +219,6 @@
 (const_int 4)]
(const_int INVALID_ATTRIBUTE)))
 
-;; The index of operand[] to get the tail policy op.
-(define_attr "tail_policy_op_idx" ""
-  (cond [(eq_attr "type" "vlde,vste,vimov,vfmov,vlds")
-(const_int 5)]
-   (const_int INVALID_ATTRIBUTE)))
-
-;; The index of operand[] to get the mask policy op.
-(define_attr "mask_policy_op_idx" ""
-  (cond [(eq_attr "type" "vlde,vste,vlds")
-(const_int 6)]
-   (const_int INVALID_ATTRIBUTE)))
-
-;; The index of operand[] to get the mask policy op.
-(define_attr "avl_type_op_idx" ""
-  (cond [(eq_attr "type" "vlde,vlde,vste,vimov,vimov,vimov,vfmov,vlds,vlds")
-(const_int 7)
-(eq_attr "type" "vldm,vstm,vimov,vmalu,vmalu")
-(const_int 5)]
-   (const_int INVALID_ATTRIBUTE)))
-
 ;; The tail policy op value.
 (define_attr "ta" ""
   (cond [(eq_attr "type" "vlde,vste,vimov,vfmov,vlds")
-- 
2.36.3



Re: [PATCH] [x86] x86: Don't add crtfastmath.o for -shared and add a new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.

2022-12-14 Thread Richard Biener via Gcc-patches
On Wed, Dec 14, 2022 at 9:34 AM Liu, Hongtao  wrote:
>
>
>
> > -Original Message-
> > From: Richard Biener 
> > Sent: Wednesday, December 14, 2022 4:23 PM
> > To: Jakub Jelinek 
> > Cc: Liu, Hongtao ; gcc-patches@gcc.gnu.org;
> > crazy...@gmail.com; hjl.to...@gmail.com; ubiz...@gmail.com
> > Subject: Re: [PATCH] [x86] x86: Don't add crtfastmath.o for -shared and add 
> > a
> > new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.
> >
> > On Wed, Dec 14, 2022 at 9:16 AM Jakub Jelinek  wrote:
> > >
> > > On Wed, Dec 14, 2022 at 09:08:02AM +0100, Richard Biener via Gcc-patches
> > wrote:
> > > > On Wed, Dec 14, 2022 at 3:21 AM liuhongt via Gcc-patches
> > > >  wrote:
> > > > >
> > > > > Don't add crtfastmath.o for -shared to avoid changing the MXCSR
> > > > > register when loading a shared library.  crtfastmath.o will be
> > > > > used only when building executables.
> > > > >
> > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > > > > Ok for trunk?
> > > >
> > > > You reject negative -mdaz-ftz but wouldn't that be useful with
> > > > -Ofast -mno-daz-ftz since there's otherwise no way to avoid that?
> > >
> > > Agreed.
> > > I even wonder if the best wouldn't be to make the option effectively
> > > three state, default, no and yes, where if the option isn't specified
> > > at all, then crtfastmath.o* is linked as is now except for -shared, if
> > > it is -mno-daz-ftz, then it is never linked in regardless of other
> > > options and if it is -mdaz-ftz, then it is linked even for -shared.
> >
> > Possibly.  I'd also suggest to split the changed -shared handling to a 
> > separate
> > patch since people may want to backport this and it should be applicable to
> > all other targets with similar handling.
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55522#c26
> So patch in the upper link is ok for trunk?

It needs target maintainer approval but yes, I think that's what we want.

Richard.

> I'll change -mdaz-ftz part as a separate patch.
> >
> > > > > --- a/gcc/config/i386/i386.opt
> > > > > +++ b/gcc/config/i386/i386.opt
> > > > > @@ -420,6 +420,10 @@ mpc80
> > > > >  Target RejectNegative
> > > > >  Set 80387 floating-point precision to 80-bit.
> > > > >
> > > > > +mdaz-ftz
> > > > > +Target RejectNegative
> > > > > +Set the FTZ and DAZ Flags.
> > >
> > > As the option is only used in the driver, shouldn't it be marked
> > > Driver and not Target?  It doesn't need to be saved/restored on every
> > > cfun switch etc.
> > >
> > > > > +@item -mdaz-ftz
> > > > > +@opindex mdaz-ftz
> > > > > +
> > > > > +the flush-to-zero (FTZ) and denormals-are-zero (DAZ) flags in the
> > > > > +MXCSR register
> > >
> > > Shouldn't description start with capital letter?
> > >
> > > > > +are used to control floating-point calculations.SSE and AVX
> > > > > +instructions including scalar and vector instructions could
> > > > > +benefit from enabling the FTZ and DAZ flags when @option{-mdaz-ftz}
> > is specified.
> > > >
> > > > Maybe say that the MXCSR register is set at program start to achieve
> > > > this when the flag is specified at _link_ time and say this switch
> > > > is ignored when -shared is specified?
> > >
> > > Jakub
> > >


Re: Ping---[V3][PATCH 2/2] Add a new warning option -Wstrict-flex-arrays.

2022-12-14 Thread Richard Biener via Gcc-patches
On Tue, 13 Dec 2022, Qing Zhao wrote:

> Richard, 
> 
> Do you have any decision on this one? 
> Do we need this warning option For GCC? 

Looking at the testcases it seems that the diagnostic amends
-Warray-bounds diagnostics for trailing but not flexible arrays?
Wouldn't it be better to generally diagnose this, so have
-Warray-bounds, with -fstrict-flex-arrays, for

struct X { int a[1]; };
int foo (struct X *p)
{
  return p->a[1];
}

emit

warning: array subscript 1 is above array bounds ...
note: the trailing array is only a flexible array member with 
-fno-strict-flex-arrays

?  Having -Wstrict-flex-arrays=N and N not agree with the
-fstrict-flex-arrays level sounds hardly useful to me but the
information that we ran into a trailing array but didn't consider
it a flex array because of -fstrict-flex-arrays is always
useful information?

But maybe I misunderstood this new diagnostic?

Thanks,
Richard.


> thanks.
> 
> Qing
> 
> > On Dec 6, 2022, at 11:18 AM, Qing Zhao  wrote:
> > 
> > '-Wstrict-flex-arrays'
> > Warn about improper usages of flexible array members according to
> > the LEVEL of the 'strict_flex_array (LEVEL)' attribute attached to
> > the trailing array field of a structure if it's available,
> > otherwise according to the LEVEL of the option
> > '-fstrict-flex-arrays=LEVEL'.
> > 
> > This option is effective only when LEVEL is bigger than 0.
> > Otherwise, it will be ignored with a warning.
> > 
> > when LEVEL=1, warnings will be issued for a trailing array
> > reference of a structure that have 2 or more elements if the
> > trailing array is referenced as a flexible array member.
> > 
> > when LEVEL=2, in addition to LEVEL=1, additional warnings will be
> > issued for a trailing one-element array reference of a structure if
> > the array is referenced as a flexible array member.
> > 
> > when LEVEL=3, in addition to LEVEL=2, additional warnings will be
> > issued for a trailing zero-length array reference of a structure if
> > the array is referenced as a flexible array member.
> > 
> > gcc/ChangeLog:
> > 
> > * doc/invoke.texi: Document -Wstrict-flex-arrays option.
> > * gimple-array-bounds.cc (check_out_of_bounds_and_warn): Add two more
> > arguments.
> > (array_bounds_checker::check_array_ref): Issue warnings for
> > -Wstrict-flex-arrays.
> > * opts.cc (finish_options): Issue warning for unsupported combination
> > of -Wstrict_flex_arrays and -fstrict-flex-array.
> > * tree-vrp.cc (execute_ranger_vrp): Enable the pass when
> > warn_strict_flex_array is true.
> > 
> > gcc/c-family/ChangeLog:
> > 
> > * c.opt (Wstrict-flex-arrays): New option.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * gcc.dg/Warray-bounds-flex-arrays-1.c: Update testing case with
> > -Wstrict-flex-arrays.
> > * gcc.dg/Warray-bounds-flex-arrays-2.c: Likewise.
> > * gcc.dg/Warray-bounds-flex-arrays-3.c: Likewise.
> > * gcc.dg/Warray-bounds-flex-arrays-4.c: Likewise.
> > * gcc.dg/Warray-bounds-flex-arrays-5.c: Likewise.
> > * gcc.dg/Warray-bounds-flex-arrays-6.c: Likewise.
> > * c-c++-common/Wstrict-flex-arrays.c: New test.
> > * gcc.dg/Wstrict-flex-arrays-2.c: New test.
> > * gcc.dg/Wstrict-flex-arrays-3.c: New test.
> > * gcc.dg/Wstrict-flex-arrays.c: New test.
> > ---
> > gcc/c-family/c.opt|   5 +
> > gcc/doc/invoke.texi   |  27 -
> > gcc/gimple-array-bounds.cc| 103 ++
> > gcc/opts.cc   |   8 ++
> > .../c-c++-common/Wstrict-flex-arrays.c|   9 ++
> > .../gcc.dg/Warray-bounds-flex-arrays-1.c  |   5 +-
> > .../gcc.dg/Warray-bounds-flex-arrays-2.c  |   6 +-
> > .../gcc.dg/Warray-bounds-flex-arrays-3.c  |   7 +-
> > .../gcc.dg/Warray-bounds-flex-arrays-4.c  |   5 +-
> > .../gcc.dg/Warray-bounds-flex-arrays-5.c  |   6 +-
> > .../gcc.dg/Warray-bounds-flex-arrays-6.c  |   7 +-
> > gcc/testsuite/gcc.dg/Wstrict-flex-arrays-2.c  |  39 +++
> > gcc/testsuite/gcc.dg/Wstrict-flex-arrays-3.c  |  39 +++
> > gcc/testsuite/gcc.dg/Wstrict-flex-arrays.c|  39 +++
> > gcc/tree-vrp.cc   |   2 +-
> > 15 files changed, 273 insertions(+), 34 deletions(-)
> > create mode 100644 gcc/testsuite/c-c++-common/Wstrict-flex-arrays.c
> > create mode 100644 gcc/testsuite/gcc.dg/Wstrict-flex-arrays-2.c
> > create mode 100644 gcc/testsuite/gcc.dg/Wstrict-flex-arrays-3.c
> > create mode 100644 gcc/testsuite/gcc.dg/Wstrict-flex-arrays.c
> > 
> > diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> > index 0d0ad0a6374..33edeefd285 100644
> > --- a/gcc/c-family/c.opt
> > +++ b/gcc/c-family/c.opt
> > @@ -976,6 +976,11 @@ Wstringop-truncation
> > C ObjC C++ LTO ObjC++ Var(warn_stringop_truncation) Warning Init (1) 
> > LangEnabledBy(C ObjC C++ LTO ObjC++, Wall)
> > Warn about truncation in string manipulation functi

Re: [PATCH] [x86] x86: Don't add crtfastmath.o for -shared and add a new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.

2022-12-14 Thread Uros Bizjak via Gcc-patches
On Wed, Dec 14, 2022 at 9:52 AM Richard Biener
 wrote:
>
> On Wed, Dec 14, 2022 at 9:34 AM Liu, Hongtao  wrote:
> >
> >
> >
> > > -Original Message-
> > > From: Richard Biener 
> > > Sent: Wednesday, December 14, 2022 4:23 PM
> > > To: Jakub Jelinek 
> > > Cc: Liu, Hongtao ; gcc-patches@gcc.gnu.org;
> > > crazy...@gmail.com; hjl.to...@gmail.com; ubiz...@gmail.com
> > > Subject: Re: [PATCH] [x86] x86: Don't add crtfastmath.o for -shared and 
> > > add a
> > > new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.
> > >
> > > On Wed, Dec 14, 2022 at 9:16 AM Jakub Jelinek  wrote:
> > > >
> > > > On Wed, Dec 14, 2022 at 09:08:02AM +0100, Richard Biener via Gcc-patches
> > > wrote:
> > > > > On Wed, Dec 14, 2022 at 3:21 AM liuhongt via Gcc-patches
> > > > >  wrote:
> > > > > >
> > > > > > Don't add crtfastmath.o for -shared to avoid changing the MXCSR
> > > > > > register when loading a shared library.  crtfastmath.o will be
> > > > > > used only when building executables.
> > > > > >
> > > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> > > > > > Ok for trunk?
> > > > >
> > > > > You reject negative -mdaz-ftz but wouldn't that be useful with
> > > > > -Ofast -mno-daz-ftz since there's otherwise no way to avoid that?
> > > >
> > > > Agreed.
> > > > I even wonder if the best wouldn't be to make the option effectively
> > > > three state, default, no and yes, where if the option isn't specified
> > > > at all, then crtfastmath.o* is linked as is now except for -shared, if
> > > > it is -mno-daz-ftz, then it is never linked in regardless of other
> > > > options and if it is -mdaz-ftz, then it is linked even for -shared.
> > >
> > > Possibly.  I'd also suggest to split the changed -shared handling to a 
> > > separate
> > > patch since people may want to backport this and it should be applicable 
> > > to
> > > all other targets with similar handling.
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55522#c26
> > So patch in the upper link is ok for trunk?
>
> It needs target maintainer approval but yes, I think that's what we want.

Yes, the one-liner is OK. Perhaps it also needs a corresponding
documentation change (haven't checked the documentation).

Uros.

>
> Richard.
>
> > I'll change -mdaz-ftz part as a separate patch.
> > >
> > > > > > --- a/gcc/config/i386/i386.opt
> > > > > > +++ b/gcc/config/i386/i386.opt
> > > > > > @@ -420,6 +420,10 @@ mpc80
> > > > > >  Target RejectNegative
> > > > > >  Set 80387 floating-point precision to 80-bit.
> > > > > >
> > > > > > +mdaz-ftz
> > > > > > +Target RejectNegative
> > > > > > +Set the FTZ and DAZ Flags.
> > > >
> > > > As the option is only used in the driver, shouldn't it be marked
> > > > Driver and not Target?  It doesn't need to be saved/restored on every
> > > > cfun switch etc.
> > > >
> > > > > > +@item -mdaz-ftz
> > > > > > +@opindex mdaz-ftz
> > > > > > +
> > > > > > +the flush-to-zero (FTZ) and denormals-are-zero (DAZ) flags in the
> > > > > > +MXCSR register
> > > >
> > > > Shouldn't description start with capital letter?
> > > >
> > > > > > +are used to control floating-point calculations.SSE and AVX
> > > > > > +instructions including scalar and vector instructions could
> > > > > > +benefit from enabling the FTZ and DAZ flags when @option{-mdaz-ftz}
> > > is specified.
> > > > >
> > > > > Maybe say that the MXCSR register is set at program start to achieve
> > > > > this when the flag is specified at _link_ time and say this switch
> > > > > is ignored when -shared is specified?
> > > >
> > > > Jakub
> > > >


[PATCH] rust: Fix up aarch64-linux bootstrap [PR106072]

2022-12-14 Thread Jakub Jelinek via Gcc-patches
Hi!

Bootstrap fails on aarch64-linux and some other arches with:
.../gcc/rust/parse/rust-parse-impl.h: In member function ‘Rust::AST::ClosureParam Rust::Parser<ManagedTokenSource>::parse_closure_param() [with ManagedTokenSource = Rust::Lexer]’:
.../gcc/rust/parse/rust-parse-impl.h:8916:49: error: ‘this’ pointer is null [-Werror=nonnull]
The problem is that while say on x86_64-linux the side-effects in the
arguments are evaluated from last argument to first, on aarch64-linux
it is the other way around, from first to last.  I believe C++, even
C++17, leaves the order in which those side-effects are evaluated unspecified
(indeterminately sequenced with no interleaving), so that is fine.
But, when the call in return statement is evaluated from first to
last, std::move (pattern) happens before pattern->get_locus () and
the former will make pattern (std::unique_ptr) a wrapper object around
nullptr, so dereferencing it later to call get_locus () on it is invalid.
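
The hazard and the fix can be sketched outside the Rust front end as follows.  Pattern, ClosureParam and make_param here are hypothetical stand-ins, not the real AST classes, and the sketch assumes the constructor takes the std::unique_ptr by value, so that the pointer is moved from while the call's arguments are being initialized.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins for the front-end types; only the shape matters.
struct Pattern
{
  int locus;
  int get_locus () const { return locus; }
};

struct ClosureParam
{
  std::unique_ptr<Pattern> pattern;
  int locus;
  // The by-value unique_ptr parameter is move-constructed during argument
  // initialization, which leaves the caller's pointer null.
  ClosureParam (std::unique_ptr<Pattern> p, int l)
    : pattern (std::move (p)), locus (l) {}
};

ClosureParam make_param (std::unique_ptr<Pattern> pattern)
{
  assert (pattern != nullptr);
  // Read the locus into a temporary first, so the result no longer
  // depends on the order in which the arguments below are initialized.
  int loc = pattern->get_locus ();
  return ClosureParam (std::move (pattern), loc);
}
```

Writing ClosureParam (std::move (pattern), pattern->get_locus ()) instead would dereference a null pointer whenever the first argument happens to be initialized before the second.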

The following patch fixes that, ok for trunk?

2022-12-14  Jakub Jelinek  

PR rust/106072
* parse/rust-parse-impl.h (parse_closure_param): Store
pattern->get_locus () in a temporary before std::move (pattern) is
invoked.

--- gcc/rust/parse/rust-parse-impl.h.jj 2022-12-13 16:50:12.708093521 +0100
+++ gcc/rust/parse/rust-parse-impl.h2022-12-14 09:50:31.73932 +0100
@@ -8912,8 +8912,9 @@ Parser::parse_closur
}
 }
 
-  return AST::ClosureParam (std::move (pattern), pattern->get_locus (),
-   std::move (type), std::move (outer_attrs));
+  Location loc = pattern->get_locus ();
+  return AST::ClosureParam (std::move (pattern), loc, std::move (type),
+   std::move (outer_attrs));
 }
 
 // Parses a grouped or tuple expression (disambiguates).

Jakub



Re: [PATCH 2/3] Make __float128 use the _Float128 type, PR target/107299

2022-12-14 Thread Jakub Jelinek via Gcc-patches
On Wed, Dec 14, 2022 at 04:46:07PM +0800, Kewen.Lin via Gcc-patches wrote:
> on 2022/12/6 19:27, Kewen.Lin via Gcc-patches wrote:
> > Hi Mike,
> > 
> > Thanks for fixing this, some comments are inlined below.
> > 
> > on 2022/11/2 10:42, Michael Meissner wrote:
> >> This patch fixes the issue that GCC cannot build when the default long 
> >> double
> >> is IEEE 128-bit.  It fails in building libgcc, specifically when it is 
> >> trying
> >> to buld the __mulkc3 function in libgcc.  It is failing in 
> >> gimple-range-fold.cc
> >> during the evrp pass.  Ultimately it is failing because the code declared 
> >> the
> >> type to use TFmode but it used F128 functions (i.e. KFmode).
> 
> By further looking into this, I found that though __float128 and _Float128 
> types
> are two different types, they have the same mode TFmode, the unexpected thing 
> is
> these two types have different precision.  I noticed it's due to the 
> "workaround"
> in build_common_tree_nodes:
> 
>   /* Work around the rs6000 KFmode having precision 113 not
>128.  */
>   const struct real_format *fmt = REAL_MODE_FORMAT (mode);
>   gcc_assert (fmt->b == 2 && fmt->emin + fmt->emax == 3);
>   int min_precision = fmt->p + ceil_log2 (fmt->emax - fmt->emin);
>   if (!extended)
>   gcc_assert (min_precision == n);
>   if (precision < min_precision)
>   precision = min_precision;
> 
> Since function useless_type_conversion_p considers two float types are 
> compatible
> if they have the same mode, so it doesn't require the explicit conversions 
> between
> these two types.  I think it's exactly what we want.  And to me, it looks 
> unexpected
> to have two types with the same mode but different precision.
> 
> So could we consider disabling the above workaround to make _Float128 have 
> the same
> precision as __float128 (long double) (the underlying TFmode)?  I tried the 
> below
> change:

The hacks with different precisions of powerpc 128-bit floating types are
very unfortunate, it is I assume because the middle-end asserted that scalar
floating point types with different modes have different precision.
We no longer assert that, as BFmode and HFmode (__bf16 and _Float16) have
the same 16-bit precision as well and e.g. C++ FE knows to treat standard
vs. extended floating point types vs. other unknown floating point types
differently in finding result type of binary operations or in which type
comparisons will be done.  That said, we'd need some target hooks to
preserve the existing behavior with __float128/__ieee128 vs. __ibm128
vs. _Float128 with both -mabi=ibmlongdouble and -mabi=ieeelongdouble.

I bet the above workaround in generic code was added for a reason, it would
surprise me if _Float128 worked at all without that hack.
Shouldn't float128_type_node be adjusted instead the same way?

Jakub
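
For reference, the min_precision computation in the quoted workaround can be
checked by hand.  The sketch below is not GCC code: fmt_params, the local
ceil_log2 and the format constants are written out here for illustration,
using the real_format convention from the quoted assert (emin + emax == 3).

```cpp
#include <cassert>

// ceil (log2 (x)) for x > 0; a local helper, not GCC's ceil_log2.
inline int
ceil_log2 (unsigned long x)
{
  assert (x > 0);
  int bits = 0;
  unsigned long pow = 1;
  while (pow < x)
    {
      pow <<= 1;
      ++bits;
    }
  return bits;
}

// Significand width and exponent range of a binary format, following the
// convention checked by the quoted gcc_assert (emin + emax == 3).
struct fmt_params
{
  int p;
  int emin;
  int emax;
};

// Same expression as in the quoted workaround:
//   fmt->p + ceil_log2 (fmt->emax - fmt->emin)
inline int
min_precision (const fmt_params &f)
{
  assert (f.emin + f.emax == 3);
  return f.p + ceil_log2 (static_cast<unsigned long> (f.emax - f.emin));
}

// IEEE interchange formats expressed in that convention.
const fmt_params binary16 = {11, -13, 16};
const fmt_params binary32 = {24, -125, 128};
const fmt_params binary64 = {53, -1021, 1024};
const fmt_params binary128 = {113, -16381, 16384};
```

For binary128 this gives 113 + 15 = 128, which is the value the workaround
raises the precision to when the mode itself reports less than that.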



RE: [PATCH 1/2]middle-end: Support early break/return auto-vectorization.

2022-12-14 Thread Richard Biener via Gcc-patches
On Tue, 13 Dec 2022, Tamar Christina wrote:

> Hi Richi,
> 
> This is a respin of the mid-end patch.  Changes since previous version:
>  - The mismatch in Boolean types is now fixed, and it generates an OR 
> reduction when it needs to.
>  - I've refactored things around to be a bit neater
>  - I've switched to using iterate_fix_dominators which has simplified the 
> loop peeling code a ton.
>  - I've moved the conditionals into the loop structure and use them from 
> there.
>  - I've moved the analysis part early into vect_analyze_data_ref_dependences
>  - I've switched to moving the scalar code instead of the vector code, as 
> moving vector required us to track a lot more complicated things like 
> internal functions.  It was also a lot more work when the loop is unrolled or 
> VF is increased due to unpacking.  I have verified as much as I can that we 
> don't seem to run into trouble doing this.
> 
> Outstanding things:
>   - Split off the SCEV parts from the rest of the patch (and determine the 
> "normal" exit based on the counting IV instead)
>   - Merge vectorizable_early_exit and transform_early_exit
> 
> I'm sending this patch out for you to take a look at the issue we were 
> discussing the issue on IRC (which you can reproduce with testcase 
> gcc.dg/vect/vect-early-break_16.c)
> 
> That should be the last outstanding issue.   Meanwhile I'll finish up the 
> splitting of SCEV and merging the two functions. 
> 
> Any additional comments is appreciated. Will hopefully finish the refactoring 
> today and send out the split patch tomorrow.

Few comments inline.

As said in the earlier review I dislike the "normal_exit" notion and
that the loop machinery is in charge of deciding on it.
get_loop_exit_condition should be unnecessary - the vectorizer should
know the exit it considers normal.  The dumping should be also adjusted,
eventually to also dump the edge as %d -> %d.

> Thanks,
> Tamar
> 
> --- inline copy of patch ---
> 
> diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
> index 
> 528b1219bc37ad8f114d5cf381c0cff899db31ee..9c7f019a51abfe2de8e1dd7135dea2463b0256a0
>  100644
> --- a/gcc/cfgloop.h
> +++ b/gcc/cfgloop.h
> @@ -385,6 +385,7 @@ extern basic_block *get_loop_body_in_custom_order (const 
> class loop *, void *,
>  
>  extern auto_vec<edge> get_loop_exit_edges (const class loop *, basic_block * 
> = NULL);
>  extern edge single_exit (const class loop *);
> +extern edge normal_exit (const class loop *);
>  extern edge single_likely_exit (class loop *loop, const vec<edge> &);
>  extern unsigned num_loop_branches (const class loop *);
>  
> diff --git a/gcc/cfgloop.cc b/gcc/cfgloop.cc
> index 
> 57bf7b1855d4dd20fb3f42388124932d0ca2b48a..97a7373fb6d9514da602d5be01050f2ec66094bc
>  100644
> --- a/gcc/cfgloop.cc
> +++ b/gcc/cfgloop.cc
> @@ -1812,6 +1812,20 @@ single_exit (const class loop *loop)
>  return NULL;
>  }
>  
> +/* Returns the normal exit edge of LOOP, or NULL if LOOP has either no exit.
> +   If loops do not have the exits recorded, NULL is returned always.  */
> +
> +edge
> +normal_exit (const class loop *loop)
> +{
> +  struct loop_exit *exit = loop->exits->next;
> +
> +  if (!loops_state_satisfies_p (LOOPS_HAVE_RECORDED_EXITS))
> +return NULL;
> +
> +  return exit->e;
> +}
> +
>  /* Returns true when BB has an incoming edge exiting LOOP.  */
>  
>  bool
> diff --git a/gcc/doc/loop.texi b/gcc/doc/loop.texi
> index 
> 6e8657a074d2447db7ae9b75cbfbb71282b84287..e1de2ac40f87f879ab691f68bd41b3bc21a83bf7
>  100644
> --- a/gcc/doc/loop.texi
> +++ b/gcc/doc/loop.texi
> @@ -211,6 +211,10 @@ relation, and breath-first search order, respectively.
>  @item @code{single_exit}: Returns the single exit edge of the loop, or
>  @code{NULL} if the loop has more than one exit.  You can only use this
>  function if @code{LOOPS_HAVE_RECORDED_EXITS} is used.
> +function if LOOPS_HAVE_MARKED_SINGLE_EXITS property is used.
> +@item @code{normal_exit}: Returns the natural exit edge of the loop,
> +even if the loop has more than one exit.  The natural exit is the exit
> +that would normally be taken where the loop to be fully executed.
>  @item @code{get_loop_exit_edges}: Enumerates the exit edges of a loop.
>  @item @code{just_once_each_iteration_p}: Returns true if the basic block
>  is executed exactly once during each iteration of a loop (that is, it
> @@ -623,4 +627,4 @@ maximum verbosity the details of a data dependence 
> relations array,
>  @code{dump_dist_dir_vectors} prints only the classical distance and
>  direction vectors for a data dependence relations array, and
>  @code{dump_data_references} prints the details of the data references
> -contained in a data reference array.
> +contained in a data reference array
> diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
> index 
> ffe69d6fcb9c46cf97ba570e85b56e586a0c9b99..a82c7b8f1efa01b02b772c9dd0f5b3dcde817091
>  100644
> --- a/gcc/doc/sourcebuild.texi
> +++ b/gcc/doc/sourcebuild.texi
> @@ -1637,6 +1637,10 @@ Target supports hardware vector

Re: [PATCH V2] rs6000: Load high and low part of 64bit constant independently

2022-12-14 Thread Kewen.Lin via Gcc-patches
Hi Jeff,

on 2022/12/12 09:44, Jiufu Guo via Gcc-patches wrote:
> Hi,
> 
> Compare with previous patch, this patch updates accoding to comments; fixes
> conflicts with trunk, and recheck bootstrap&regtest.
> https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607333.html
> 
> For a complicate 64bit constant, blow is one instruction-sequence to
    below

> build:
>   lis 9,0x800a
>   ori 9,9,0xabcd
>   sldi 9,9,32
>   oris 9,9,0xc167
>   ori 9,9,0xfa16
> 
> while we can also use below sequence to build:
>   lis 9,0xc167
>   lis 10,0x800a
>   ori 9,9,0xfa16
>   ori 10,10,0xabcd
>   rldimi 9,10,32,0
> This sequence is using 2 registers to build high and low part firstly,
> and then merge them.
> 
> In parallel aspect, this sequence would be faster. (Ofcause, using 1 more
> register with potential register pressure).
> 
> The instruction sequence with two registers for parallel version can be
> generated only if can_create_pseudo_p.  Otherwise, the one register
> sequence is generated.
> 
> Bootstrap and regtest pass on ppc64{,le}.
> Is this ok for trunk?
> 
> 
> BR,
> Jeff(Jiufu)
> 
> 
> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Generate
>   more parallel code if can_create_pseudo_p.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/powerpc/parall_5insn_const.c: New test.
> 
> ---
>  gcc/config/rs6000/rs6000.cc   | 37 +--
>  .../gcc.target/powerpc/parall_5insn_const.c   | 27 ++
>  2 files changed, 52 insertions(+), 12 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c
> 
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index b3a609f3aa3..3020d9780bc 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -10322,19 +10322,32 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT 
> c)
>  }
>else
>  {
> -  temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
> -
> -  emit_move_insn (temp, GEN_INT (sext_hwi (ud4 << 16, 32)));
> -  if (ud3 != 0)
> - emit_move_insn (temp, gen_rtx_IOR (DImode, temp, GEN_INT (ud3)));
> +  if (can_create_pseudo_p ())
> + {
> +   /* lis H,U4; ori H,U3; lis L,U2; ori L,U1; rldimi L,H,32,0.  */

Nit: It's probably better to update the capitals with the actual variable names
in upper case, since they are also short, ...

> +   rtx high = gen_reg_rtx (DImode);
> +   rtx low = gen_reg_rtx (DImode);
> +   HOST_WIDE_INT num = (ud2 << 16) | ud1;
> +   rs6000_emit_set_long_const (low, sext_hwi (num, 32));
> +   num = (ud4 << 16) | ud3;
> +   rs6000_emit_set_long_const (high, sext_hwi (num, 32));
> +   emit_insn (gen_rotldi3_insert_3 (dest, high, GEN_INT (32), low,
> +GEN_INT (0x)));
> + }
> +  else
> + {
> +   /* lis A,U4; ori A,U3; rotl A,32; oris A,U2; ori A,U1.  */

... and here, the others look good to me.  Thanks!

BR,
Kewen
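
The two instruction sequences quoted above can be compared with a small
model.  This is only a sketch: the helpers model the effect of lis, ori, oris,
sldi and "rldimi RA,RS,32,0" on 64-bit values, and build_serial,
build_parallel and build_checked are hypothetical names, not functions from
rs6000.cc.

```cpp
#include <cassert>
#include <cstdint>

using u64 = std::uint64_t;

// lis: 16-bit immediate shifted left by 16, sign-extended from bit 31.
inline u64 lis (u64 imm)
{
  const u64 v = (imm & 0xffff) << 16;
  return (v & 0x80000000ULL) ? (v | 0xffffffff00000000ULL) : v;
}

// ori / oris: OR in a zero-extended 16-bit immediate (oris shifts it by 16).
inline u64 ori (u64 r, u64 imm) { return r | (imm & 0xffff); }
inline u64 oris (u64 r, u64 imm) { return r | ((imm & 0xffff) << 16); }

// sldi: shift left by n bits.
inline u64 sldi (u64 r, unsigned n) { return r << n; }

// rldimi dst,src,32,0: the low word of src replaces the high word of dst.
inline u64 rldimi_32_0 (u64 dst, u64 src)
{
  return (src << 32) | (dst & 0xffffffffULL);
}

// One-register sequence: lis; ori; sldi; oris; ori.
inline u64 build_serial (u64 ud4, u64 ud3, u64 ud2, u64 ud1)
{
  u64 r = lis (ud4);
  r = ori (r, ud3);
  r = sldi (r, 32);
  r = oris (r, ud2);
  return ori (r, ud1);
}

// Two-register sequence: the halves are built independently, then merged.
inline u64 build_parallel (u64 ud4, u64 ud3, u64 ud2, u64 ud1)
{
  const u64 low = ori (lis (ud2), ud1);
  const u64 high = ori (lis (ud4), ud3);
  return rldimi_32_0 (low, high);
}

// Both sequences are expected to produce the same constant.
inline u64 build_checked (u64 ud4, u64 ud3, u64 ud2, u64 ud1)
{
  const u64 a = build_serial (ud4, ud3, ud2, ud1);
  const u64 b = build_parallel (ud4, ud3, ud2, ud1);
  assert (a == b);
  return a;
}
```

In the second form the two lis/ori pairs have no dependency on each other,
which is where the extra parallelism comes from, at the cost of one more
register.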



Re: [PATCH] rust: Fix up aarch64-linux bootstrap [PR106072]

2022-12-14 Thread Arthur Cohen

Hi Jakub,

On 12/14/22 10:14, Jakub Jelinek via Gcc-rust wrote:

Hi!

Bootstrap fails on aarch64-linux and some other arches with:
.../gcc/rust/parse/rust-parse-impl.h: In member function ‘Rust::AST::ClosureParam 
Rust::Parser<ManagedTokenSource>::parse_closure_param() [with 
ManagedTokenSource = Rust::Lexer]’:
.../gcc/rust/parse/rust-parse-impl.h:8916:49: error: ‘this’ pointer is null 
[-Werror=nonnull]
The problem is that while say on x86_64-linux the side-effects in the
arguments are evaluated from last argument to first, on aarch64-linux
it is the other way around, from first to last.  The C++ I believe even
in C++17 makes the evaluation of those side-effects unordered
(indeterminately sequenced with no interleaving), so that is fine.
But, when the call in return statement is evaluated from first to
last, std::move (pattern) happens before pattern->get_locus () and
the former will make pattern (std::unique_ptr) a wrapper object around
nullptr, so dereferencing it later to call get_locus () on it is invalid.

The following patch fixes that, ok for trunk?

2022-12-14  Jakub Jelinek  

PR rust/106072
* parse/rust-parse-impl.h (parse_closure_param): Store
pattern->get_locus () in a temporary before std::move (pattern) is
invoked.

--- gcc/rust/parse/rust-parse-impl.h.jj 2022-12-13 16:50:12.708093521 +0100
+++ gcc/rust/parse/rust-parse-impl.h2022-12-14 09:50:31.73932 +0100
@@ -8912,8 +8912,9 @@ Parser::parse_closur
}
  }
  
-  return AST::ClosureParam (std::move (pattern), pattern->get_locus (),

-   std::move (type), std::move (outer_attrs));
+  Location loc = pattern->get_locus ();
+  return AST::ClosureParam (std::move (pattern), loc, std::move (type),
+   std::move (outer_attrs));
  }
  
  // Parses a grouped or tuple expression (disambiguates).


Jakub



Thanks :) this looks good to me. We already have that issue fixed in our 
upstream dev branch, by this PR:


https://github.com/Rust-GCC/gccrs/pull/1619

but we have yet to update GCC's master with our upstream dev branch, so 
in the meantime feel free to apply your patch. When I'll get to updating 
master, I'm expecting these kinds of tiny conflicts and we'll deal with 
them.


Thanks a lot for working on this and sorry that my tardiness in updating 
has caused a duplication of efforts.


All the best,

--
Arthur Cohen 

Toolchain Engineer

Embecosm GmbH

Geschäftsführer: Jeremy Bennett
Niederlassung: Nürnberg
Handelsregister: HR-B 36368
www.embecosm.de

Fürther Str. 27
90429 Nürnberg


Tel.: 091 - 128 707 040
Fax: 091 - 128 707 077




Re: [PATCH 2/3] Make __float128 use the _Float128 type, PR target/107299

2022-12-14 Thread Kewen.Lin via Gcc-patches
Hi Jakub,

Thanks for the comments!

on 2022/12/14 17:36, Jakub Jelinek wrote:
> On Wed, Dec 14, 2022 at 04:46:07PM +0800, Kewen.Lin via Gcc-patches wrote:
>> on 2022/12/6 19:27, Kewen.Lin via Gcc-patches wrote:
>>> Hi Mike,
>>>
>>> Thanks for fixing this, some comments are inlined below.
>>>
>>> on 2022/11/2 10:42, Michael Meissner wrote:
>>>> This patch fixes the issue that GCC cannot build when the default long 
>>>> double
>>>> is IEEE 128-bit.  It fails in building libgcc, specifically when it is 
>>>> trying
>>>> to buld the __mulkc3 function in libgcc.  It is failing in 
>>>> gimple-range-fold.cc
>>>> during the evrp pass.  Ultimately it is failing because the code declared 
>>>> the
>>>> type to use TFmode but it used F128 functions (i.e. KFmode).
>>
>> By further looking into this, I found that though __float128 and _Float128 
>> types
>> are two different types, they have the same mode TFmode, the unexpected 
>> thing is
>> these two types have different precision.  I noticed it's due to the 
>> "workaround"
>> in build_common_tree_nodes:
>>
>>   /* Work around the rs6000 KFmode having precision 113 not
>>   128.  */
>>   const struct real_format *fmt = REAL_MODE_FORMAT (mode);
>>   gcc_assert (fmt->b == 2 && fmt->emin + fmt->emax == 3);
>>   int min_precision = fmt->p + ceil_log2 (fmt->emax - fmt->emin);
>>   if (!extended)
>>  gcc_assert (min_precision == n);
>>   if (precision < min_precision)
>>  precision = min_precision;
>>
>> Since function useless_type_conversion_p considers two float types are 
>> compatible
>> if they have the same mode, so it doesn't require the explicit conversions 
>> between
>> these two types.  I think it's exactly what we want.  And to me, it looks 
>> unexpected
>> to have two types with the same mode but different precision.
>>
>> So could we consider disabling the above workaround to make _Float128 have 
>> the same
>> precision as __float128 (long double) (the underlying TFmode)?  I tried the 
>> below
>> change:
> 
> The hacks with different precisions of powerpc 128-bit floating types are
> very unfortunate, it is I assume because the middle-end asserted that scalar
> floating point types with different modes have different precision.
> We no longer assert that, as BFmode and HFmode (__bf16 and _Float16) have
> the same 16-bit precision as well and e.g. C++ FE knows to treat standard
> vs. extended floating point types vs. other unknown floating point types
> differently in finding result type of binary operations or in which type
> comparisons will be done.  

It's good news, for now those three long double modes on Power have different
precisions, if they can have the same precision, I'd expect the ICE should be
gone.

> That said, we'd need some target hooks to
> preserve the existing behavior with __float128/__ieee128 vs. __ibm128
> vs. _Float128 with both -mabi=ibmlongdouble and -mabi=ieeelongdouble.
> 
> I bet the above workaround in generic code was added for a reason, it would
> surprise me if _Float128 worked at all without that hack.

OK, I'll have a look at those nan failures soon.

> Shouldn't float128_type_node be adjusted instead the same way?

Not sure, the regression testing only had nan related failures exposed.

BR,
Kewen


[Patch] mklog: only do is_binary_file check if available

2022-12-14 Thread Tobias Burnus

Ubuntu 20.04.5 LTS (focal) unfortunately ships a unidiff.PatchSet that is too
old for the feature added on Monday.

Solution: use is_binary_file only when it is available.

OK for mainline?

Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
#!/usr/bin/env python3

# Copyright (C) 2020-2022 Free Software Foundation, Inc.
#
# This file is part of GCC.
#
# GCC is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3, or (at your option)
# any later version.
#
# GCC is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with GCC; see the file COPYING.  If not, write to
# the Free Software Foundation, 51 Franklin Street, Fifth Floor,
# Boston, MA 02110-1301, USA.

# This script parses a .diff file generated with 'diff -up' or 'diff -cp'
# and adds a skeleton ChangeLog file to the file. It does not try to be
# too smart when parsing function names, but it produces a reasonable
# approximation.
#
# Author: Martin Liska 

import argparse
import datetime
import json
import os
import re
import subprocess
import sys
from itertools import takewhile

import requests

from unidiff import PatchSet

LINE_LIMIT = 100
TAB_WIDTH = 8
CO_AUTHORED_BY_PREFIX = 'co-authored-by: '

pr_regex = re.compile(r'(\/(\/|\*)|[Cc*!])\s+(?P<pr>PR [a-z+-]+\/[0-9]+)')
prnum_regex = re.compile(r'PR (?P<comp>[a-z+-]+)/(?P<num>[0-9]+)')
dr_regex = re.compile(r'(\/(\/|\*)|[Cc*!])\s+(?P<dr>DR [0-9]+)')
dg_regex = re.compile(r'{\s+dg-(error|warning)')
pr_filename_regex = re.compile(r'(^|[\W_])[Pp][Rr](?P<pr>\d{4,})')
identifier_regex = re.compile(r'^([a-zA-Z0-9_#].*)')
comment_regex = re.compile(r'^\/\*')
struct_regex = re.compile(r'^(class|struct|union|enum)\s+'
  r'(GTY\(.*\)\s+)?([a-zA-Z0-9_]+)')
macro_regex = re.compile(r'#\s*(define|undef)\s+([a-zA-Z0-9_]+)')
super_macro_regex = re.compile(r'^DEF[A-Z0-9_]+\s*\(([a-zA-Z0-9_]+)')
fn_regex = re.compile(r'([a-zA-Z_][^()\s]*)\s*\([^*]')
template_and_param_regex = re.compile(r'<[^<>]*>')
md_def_regex = re.compile(r'\(define.*\s+"(.*)"')
bugzilla_url = 'https://gcc.gnu.org/bugzilla/rest.cgi/bug?id=%s&' \
   'include_fields=summary,component'

function_extensions = {'.c', '.cpp', '.C', '.cc', '.h', '.inc', '.def', '.md'}

# NB: Makefile.in isn't listed as it's not always generated.
generated_files = {'aclocal.m4', 'config.h.in', 'configure'}

help_message = """\
Generate ChangeLog template for PATCH.
PATCH must be generated using diff(1)'s -up or -cp options
(or their equivalent in git).
"""

script_folder = os.path.realpath(__file__)
root = os.path.dirname(os.path.dirname(script_folder))


def find_changelog(path):
    folder = os.path.split(path)[0]
    while True:
        if os.path.exists(os.path.join(root, folder, 'ChangeLog')):
            return folder
        folder = os.path.dirname(folder)
        if folder == '':
            return folder
    raise AssertionError()


def extract_function_name(line):
    if comment_regex.match(line):
        return None
    m = struct_regex.search(line)
    if m:
        # Struct declaration
        return m.group(1) + ' ' + m.group(3)
    m = macro_regex.search(line)
    if m:
        # Macro definition
        return m.group(2)
    m = super_macro_regex.search(line)
    if m:
        # Supermacro
        return m.group(1)
    m = fn_regex.search(line)
    if m:
        # Discard template and function parameters.
        fn = m.group(1)
        fn = re.sub(template_and_param_regex, '', fn)
        return fn.rstrip()
    return None


def try_add_function(functions, line):
    fn = extract_function_name(line)
    if fn and fn not in functions:
        functions.append(fn)
    return bool(fn)


def sort_changelog_files(changed_file):
    return (changed_file.is_added_file, changed_file.is_removed_file)


def get_pr_titles(prs):
    output = []
    for idx, pr in enumerate(prs):
        pr_id = pr.split('/')[-1]
        r = requests.get(bugzilla_url % pr_id)
        bugs = r.json()['bugs']
        if len(bugs) == 1:
            prs[idx] = 'PR %s/%s' % (bugs[0]['component'], pr_id)
            out = '%s - %s\n' % (prs[idx], bugs[0]['summary'])
            if out not in output:
                output.append(out)
    if output:
        output.append('')
    return '\n'.join(output)


def append_changelog_line(out, relative_path, text):
    line = f'\t* {relative_path}:'
    if len(line.replace('\t', ' ' * TAB_WIDTH) + ' ' + text) <= LINE_LIMIT:
        out += f'{line} {text}\n'
  

Re: [PATCH V4 1/2] rs6000: use li;x?oris to build constant

2022-12-14 Thread Kewen.Lin via Gcc-patches
Hi Jeff,

on 2022/12/12 09:38, Jiufu Guo via Gcc-patches wrote:
> Hi,
> 
> For constant C:
> If '(c & 0x8000ULL) == 0x8000ULL' or say:
> 32(1) || 16(x) || 1(1) || 15(x), using "li; xoris" would be ok.
> 
> If '(c & 0x80008000ULL) == 0x8000ULL' or say:
> 32(0) || 1(1) || 15(x) || 1(0) || 15(x), we could use "li; oris" to
> build constant 'C'.
> 
> Here N(M) means N continuous bit M, x for M means it is ok for either
> 1 or 0; '||' means concatenation.
> 
> This patch update rs6000_emit_set_long_const to support those constants.
> 
> Compare with previous version, this patch fixes conflicts with trunk.
> and put li;x?oris as the first patch (lis;xoris as the second patch).
> Previous version:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-December/607618.html
> 
> Bootstrap and regtest pass on ppc64{,le}.
> 
> Is this ok for trunk?
> 
> BR,
> Jeff (Jiufu)
> 
> 
>   PR target/106708
> 
> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Add using
>   "li; x?oris" to build constant.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/powerpc/pr106708.c: New test.
> 
> ---
>  gcc/config/rs6000/rs6000.cc | 36 +++---
>  gcc/testsuite/gcc.target/powerpc/pr106708.c | 41 +
>  2 files changed, 71 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr106708.c
> 
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index b3a609f3aa3..8c1192a10c8 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -10251,17 +10251,41 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT 
> c)
>if (ud1 != 0)
>   emit_move_insn (dest, gen_rtx_IOR (DImode, temp, GEN_INT (ud1)));
>  }
> +  else if (ud4 == 0x && ud3 == 0x && (ud1 & 0x8000))
> +{
> +  /* li; xoris */
> +  temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
> +  emit_move_insn (temp, GEN_INT (sext_hwi (ud1, 16)));
> +  emit_move_insn (dest, gen_rtx_XOR (DImode, temp,
> +  GEN_INT ((ud2 ^ 0x) << 16)));
> +}
>else if (ud3 == 0 && ud4 == 0)
>  {
>temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
>  
>gcc_assert (ud2 & 0x8000);
> -  emit_move_insn (temp, GEN_INT (sext_hwi (ud2 << 16, 32)));
> -  if (ud1 != 0)
> - emit_move_insn (temp, gen_rtx_IOR (DImode, temp, GEN_INT (ud1)));
> -  emit_move_insn (dest,
> -   gen_rtx_ZERO_EXTEND (DImode,
> -gen_lowpart (SImode,temp)));
> +
> +  if (ud1 == 0)
> + {
> +   /* lis; rldicl */
> +   emit_move_insn (temp, GEN_INT (sext_hwi (ud2 << 16, 32)));
> +   emit_move_insn (dest,
> +   gen_rtx_AND (DImode, temp, GEN_INT (0x)));
> + }
> +  else if (!(ud1 & 0x8000))
> + {
> +   /* li; oris */
> +   emit_move_insn (temp, GEN_INT (ud1));
> +   emit_move_insn (dest,
> +   gen_rtx_IOR (DImode, temp, GEN_INT (ud2 << 16)));
> + }
> +  else
> + {

Nit: Add "/* lis; ori; rldicl */" like the other arms?

> +   emit_move_insn (temp, GEN_INT (sext_hwi (ud2 << 16, 32)));
> +   emit_move_insn (temp, gen_rtx_IOR (DImode, temp, GEN_INT (ud1)));
> +   emit_move_insn (dest,
> +   gen_rtx_AND (DImode, temp, GEN_INT (0x)));
> + }
>  }
>else if (ud1 == ud3 && ud2 == ud4)
>  {
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr106708.c 
> b/gcc/testsuite/gcc.target/powerpc/pr106708.c
> new file mode 100644
> index 000..dc9ceda8367
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr106708.c
> @@ -0,0 +1,41 @@
> +/* PR target/106708 */
> +/* { dg-do run } */
> +/* { dg-options "-O2 -mno-prefixed -save-temps" } */
> +/* { dg-require-effective-target has_arch_ppc64 } */
> +
> +long long arr[]
> +  = {0x7cdeab55LL, 0x98765432LL, 0xabcdLL};
> +
> +void __attribute__ ((__noipa__)) lixoris (long long *arg)

Nit: Adding separator "_" to make the name like "li_xoris" or even
"test_li_xoris" seems better to read.  Also applied for the other
function names "lioris" and "lisrldicl".

The others look good to me.  Thanks!

BR,
Kewen
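
The two bit patterns from the patch description can be written out as
predicates.  This is a sketch under assumptions: the masks are reconstructed
from the "N(M)" notation in the description (the hex constants in the quoted
text are incomplete), and all function names are hypothetical rather than
taken from rs6000.cc.

```cpp
#include <cassert>
#include <cstdint>

using u64 = std::uint64_t;

// li: 16-bit immediate, sign-extended to 64 bits.
inline u64 li (u64 imm)
{
  const u64 v = imm & 0xffff;
  return (v & 0x8000) ? (v | 0xffffffffffff0000ULL) : v;
}

// oris / xoris: OR / XOR with a 16-bit immediate shifted left by 16.
inline u64 oris (u64 r, u64 imm) { return r | ((imm & 0xffff) << 16); }
inline u64 xoris (u64 r, u64 imm) { return r ^ ((imm & 0xffff) << 16); }

// 32(1) || 16(x) || 1(1) || 15(x): high word all ones, bit 15 set.
inline bool li_xoris_applies (u64 c)
{
  return (c & 0xffffffff00008000ULL) == 0xffffffff00008000ULL;
}

// 32(0) || 1(1) || 15(x) || 1(0) || 15(x): high word zero, bit 31 set,
// bit 15 clear.
inline bool li_oris_applies (u64 c)
{
  return (c & 0xffffffff80008000ULL) == 0x0000000080000000ULL;
}

// li sign-extends ones into bits 16..63; xoris then flips bits 16..31 back
// to the wanted value.
inline u64 build_li_xoris (u64 c)
{
  assert (li_xoris_applies (c));
  const u64 ud1 = c & 0xffff;
  const u64 ud2 = (c >> 16) & 0xffff;
  return xoris (li (ud1), ud2 ^ 0xffff);
}

// li leaves bits 16..63 zero here, so oris only has to fill bits 16..31.
inline u64 build_li_oris (u64 c)
{
  assert (li_oris_applies (c));
  const u64 ud1 = c & 0xffff;
  const u64 ud2 = (c >> 16) & 0xffff;
  return oris (li (ud1), ud2);
}
```

In both cases the second instruction only touches bits 16..31, so the sign
extension performed by li decides which of the two forms can be used.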


Re: [PATCH 2/3] Make __float128 use the _Float128 type, PR target/107299

2022-12-14 Thread Jakub Jelinek via Gcc-patches
On Wed, Dec 14, 2022 at 06:11:26PM +0800, Kewen.Lin wrote:
> > The hacks with different precisions of powerpc 128-bit floating types are
> > very unfortunate, it is I assume because the middle-end asserted that scalar
> > floating point types with different modes have different precision.
> > We no longer assert that, as BFmode and HFmode (__bf16 and _Float16) have
> > the same 16-bit precision as well and e.g. C++ FE knows to treat standard
> > vs. extended floating point types vs. other unknown floating point types
> > differently in finding result type of binary operations or in which type
> > comparisons will be done.  
> 
> It's good news, for now those three long double modes on Power have different
> precisions, if they can have the same precision, I'd expect the ICE should be
> gone.

I'm talking mainly about r13-3292, the asserts now check different modes
have different precision unless it is half vs. brain or vice versa, but
could be changed further, but if the precision is the same, the FEs
and the middle-end needs to know how to deal with those.
For C++23, say when __ibm128 is the same as long double and _Float128 is
supported, the 2 types are unordered (neither is a subset or superset of
the other because there are many _Float128 values one can't represent
in double double (whether anything with exponent larger than what double
can represent or most of the more precise values), but because of the
variable precision there are double double values that can't be represented
in _Float128 either) and so we can error on comparisons of those or on
arithmetics with such arguments (unless explicitly converted first).
But for backwards compatibility we can't do that for __float128 vs. __ibm128
and so need to backwards compatibly decide what wins.  And for the
middle-end say even for mode conversions decide what is widening and what is
narrowing even when they are unordered.

Jakub



[PATCH (pushed)] mklog: do not depend on recent unidiff version

2022-12-14 Thread Martin Liška
contrib/ChangeLog:

* mklog.py: Check for number of hunks and not if a modified
file is binary.
---
 contrib/mklog.py | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/contrib/mklog.py b/contrib/mklog.py
index 358b7fc6b8b..5dea8a05c0c 100755
--- a/contrib/mklog.py
+++ b/contrib/mklog.py
@@ -186,8 +186,9 @@ def generate_changelog(data, no_functions=False, 
fill_pr_titles=False,
 # contains commented code which a note that it
 # has not been tested due to a certain PR or DR.
 this_file_prs = []
-if not file.is_binary_file:
-for line in list(file)[0][0:10]:
+hunks = list(file)
+if hunks:
+for line in hunks[0][0:10]:
 m = pr_regex.search(line.value)
 if m:
 pr = m.group('pr')
-- 
2.38.1



RE: [PATCH] rust: Fix up aarch64-linux bootstrap [PR106072]

2022-12-14 Thread Kyrylo Tkachov via Gcc-patches


> -Original Message-
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Arthur Cohen
> Sent: Wednesday, December 14, 2022 10:05 AM
> To: Jakub Jelinek ; gcc-r...@gcc.gnu.org; gcc-
> patc...@gcc.gnu.org
> Subject: Re: [PATCH] rust: Fix up aarch64-linux bootstrap [PR106072]
> 
> Hi Jakub,
> 
> On 12/14/22 10:14, Jakub Jelinek via Gcc-rust wrote:
> > Hi!
> >
> > Bootstrap fails on aarch64-linux and some other arches with:
> > .../gcc/rust/parse/rust-parse-impl.h: In member function
> > ‘Rust::AST::ClosureParam
> > Rust::Parser<ManagedTokenSource>::parse_closure_param() [with
> > ManagedTokenSource = Rust::Lexer]’:
> > .../gcc/rust/parse/rust-parse-impl.h:8916:49: error: ‘this’ pointer is null
> > [-Werror=nonnull]
> > The problem is that while say on x86_64-linux the side-effects in the
> > arguments are evaluated from last argument to first, on aarch64-linux
> > it is the other way around, from first to last.  The C++ I believe even
> > in C++17 makes the evaluation of those side-effects unordered
> > (indeterminately sequenced with no interleaving), so that is fine.
> > But, when the call in return statement is evaluated from first to
> > last, std::move (pattern) happens before pattern->get_locus () and
> > the former will make pattern (std::unique_ptr) a wrapper object around
> > nullptr, so dereferencing it later to call get_locus () on it is invalid.
> >
> > The following patch fixes that, ok for trunk?

FWIW, with this patch my aarch64 bootstrap progressed past the previous point 
of failure (it's currently in stage 3).
Thanks,
Kyrill

> >
> > 2022-12-14  Jakub Jelinek  
> >
> > PR rust/106072
> > * parse/rust-parse-impl.h (parse_closure_param): Store
> > pattern->get_locus () in a temporary before std::move (pattern) is
> > invoked.
> >
> > --- gcc/rust/parse/rust-parse-impl.h.jj 2022-12-13
> 16:50:12.708093521 +0100
> > +++ gcc/rust/parse/rust-parse-impl.h2022-12-14 09:50:31.73932
> +0100
> > @@ -8912,8 +8912,9 @@ Parser::parse_closur
> > }
> >   }
> >
> > -  return AST::ClosureParam (std::move (pattern), pattern->get_locus (),
> > -   std::move (type), std::move (outer_attrs));
> > +  Location loc = pattern->get_locus ();
> > +  return AST::ClosureParam (std::move (pattern), loc, std::move (type),
> > +   std::move (outer_attrs));
> >   }
> >
> >   // Parses a grouped or tuple expression (disambiguates).
> >
> > Jakub
> >
> 
> Thanks :) this looks good to me. We already have that issue fixed in our
> upstream dev branch, by this PR:
> 
> https://github.com/Rust-GCC/gccrs/pull/1619
> 
> but we have yet to update GCC's master with our upstream dev branch, so
> in the meantime feel free to apply your patch. When I'll get to updating
> master, I'm expecting these kinds of tiny conflicts and we'll deal with
> them.
> 
> Thanks a lot for working on this and sorry that my tardiness in updating
> has caused a duplication of efforts.
> 
> All the best,
> 
> --
> Arthur Cohen 
> 
> Toolchain Engineer
> 
> Embecosm GmbH
> 
> Geschäftsführer: Jeremy Bennett
> Niederlassung: Nürnberg
> Handelsregister: HR-B 36368
> www.embecosm.de
> 
> Fürther Str. 27
> 90429 Nürnberg
> 
> 
> Tel.: 091 - 128 707 040
> Fax: 091 - 128 707 077


[Patch] Fortran/OpenMP: Add parsing support for allocators directive

2022-12-14 Thread Tobias Burnus

This patch adds parsing/argument-checking support for
  '!$omp allocators allocate([align(int),allocator(a) :] list)'

This is kind of a logical follow-up and prep patch for the
  '!$omp allocate(list) [align(v) allocator(a)]'
support that was submitted as part of a larger patchset by Abid; cf.
review at
  "[PATCH 1/5] [gfortran] Add parsing support for allocate directive (OpenMP 
5.0)."
  https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603258.html

My follow-up patch will add parsing support for declarative/executable '!$omp 
allocate'.

OK for mainline?

Tobias
-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
München; limited liability company; Managing Directors: Thomas
Heurung, Frank Thürauf; Registered office: München; Commercial register:
München, HRB 106955
Fortran/OpenMP: Add parsing support for allocators directive

gcc/fortran/ChangeLog:

	* gfortran.h (enum gfc_statement): Add ST_OMP_ALLOCATORS and
	ST_OMP_END_ALLOCATORS.
	(enum gfc_exec_op): Add EXEC_OMP_ALLOCATORS.
	* dump-parse-tree.cc (show_omp_node, show_code_node): Handle
	OpenMP's ALLOCATORS directive.
	* match.h (gfc_match_omp_allocators): New prototype.
	* openmp.cc (OMP_ALLOCATORS_CLAUSES): Define.
	(gfc_match_omp_allocators): New.
	(resolve_omp_clauses, omp_code_to_statement,
	gfc_resolve_omp_directive): Handle EXEC_OMP_ALLOCATORS.
	* parse.cc (parse_openmp_allocate_block): New.
	(case_exec_markers): Add ST_OMP_ALLOCATORS.
	(decode_omp_directive, gfc_ascii_statement,
	parse_executable): Parse OpenMP allocators directive.
	* resolve.cc (gfc_resolve_blocks): Handle EXEC_OMP_ALLOCATORS.
	* st.cc (gfc_free_statement): Likewise.
	* trans.cc (trans_code): Likewise.
	* trans-openmp.cc (gfc_trans_omp_directive): Show 'sorry' for
	EXEC_OMP_ALLOCATORS.

gcc/testsuite/ChangeLog:

	* gfortran.dg/gomp/allocators-1.f90: New test.
	* gfortran.dg/gomp/allocators-2.f90: New test.

 gcc/fortran/dump-parse-tree.cc  |  2 +
 gcc/fortran/gfortran.h  |  3 +-
 gcc/fortran/match.h |  1 +
 gcc/fortran/openmp.cc   | 31 ++-
 gcc/fortran/parse.cc| 50 -
 gcc/fortran/resolve.cc  |  2 +
 gcc/fortran/st.cc   |  1 +
 gcc/fortran/trans-openmp.cc |  3 ++
 gcc/fortran/trans.cc|  1 +
 gcc/testsuite/gfortran.dg/gomp/allocators-1.f90 | 28 ++
 gcc/testsuite/gfortran.dg/gomp/allocators-2.f90 | 22 +++
 11 files changed, 140 insertions(+), 4 deletions(-)

diff --git a/gcc/fortran/dump-parse-tree.cc b/gcc/fortran/dump-parse-tree.cc
index 5ae72dc1cac..4565b71c758 100644
--- a/gcc/fortran/dump-parse-tree.cc
+++ b/gcc/fortran/dump-parse-tree.cc
@@ -2081,6 +2081,7 @@ show_omp_node (int level, gfc_code *c)
 case EXEC_OACC_CACHE: name = "CACHE"; is_oacc = true; break;
 case EXEC_OACC_ENTER_DATA: name = "ENTER DATA"; is_oacc = true; break;
 case EXEC_OACC_EXIT_DATA: name = "EXIT DATA"; is_oacc = true; break;
+case EXEC_OMP_ALLOCATORS: name = "ALLOCATORS"; break;
 case EXEC_OMP_ASSUME: name = "ASSUME"; break;
 case EXEC_OMP_ATOMIC: name = "ATOMIC"; break;
 case EXEC_OMP_BARRIER: name = "BARRIER"; break;
@@ -3409,6 +3410,7 @@ show_code_node (int level, gfc_code *c)
 case EXEC_OACC_CACHE:
 case EXEC_OACC_ENTER_DATA:
 case EXEC_OACC_EXIT_DATA:
+case EXEC_OMP_ALLOCATORS:
 case EXEC_OMP_ASSUME:
 case EXEC_OMP_ATOMIC:
 case EXEC_OMP_CANCEL:
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 5f8a81ae4a1..63f38d2 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -318,6 +318,7 @@ enum gfc_statement
   ST_OMP_END_MASKED_TASKLOOP, ST_OMP_MASKED_TASKLOOP_SIMD,
   ST_OMP_END_MASKED_TASKLOOP_SIMD, ST_OMP_SCOPE, ST_OMP_END_SCOPE,
   ST_OMP_ERROR, ST_OMP_ASSUME, ST_OMP_END_ASSUME, ST_OMP_ASSUMES,
+  ST_OMP_ALLOCATORS, ST_OMP_END_ALLOCATORS,
   /* Note: gfc_match_omp_nothing returns ST_NONE. */
   ST_OMP_NOTHING, ST_NONE
 };
@@ -2959,7 +2960,7 @@ enum gfc_exec_op
   EXEC_OMP_TARGET_TEAMS_LOOP, EXEC_OMP_MASKED, EXEC_OMP_PARALLEL_MASKED,
   EXEC_OMP_PARALLEL_MASKED_TASKLOOP, EXEC_OMP_PARALLEL_MASKED_TASKLOOP_SIMD,
   EXEC_OMP_MASKED_TASKLOOP, EXEC_OMP_MASKED_TASKLOOP_SIMD, EXEC_OMP_SCOPE,
-  EXEC_OMP_ERROR
+  EXEC_OMP_ERROR, EXEC_OMP_ALLOCATORS
 };
 
 typedef struct gfc_code
diff --git a/gcc/fortran/match.h b/gcc/fortran/match.h
index 2a805815d9c..b1f5db80125 100644
--- a/gcc/fortran/match.h
+++ b/gcc/fortran/match.h
@@ -149,6 +149,7 @@ match gfc_match_oacc_routine (void);
 
 /* OpenMP directive matchers.  */
 match gfc_match_omp_eos_error (void);
+match gfc_match_omp_allocators (void);
 match gfc_match_omp_assume (void);
 match gfc_match_omp_assumes (void);
 match gfc_match_omp_atomic (void);
diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index 686f924b47a..e978f8774

RE: [PATCH]AArch64 div-by-255, ensure that arguments are registers. [PR107988]

2022-12-14 Thread Tamar Christina via Gcc-patches
> -Original Message-
> From: Richard Sandiford 
> Sent: Friday, December 9, 2022 7:08 AM
> To: Richard Earnshaw 
> Cc: Tamar Christina ; gcc-patches@gcc.gnu.org;
> nd ; Richard Earnshaw ;
> Marcus Shawcroft ; Kyrylo Tkachov
> 
> Subject: Re: [PATCH]AArch64 div-by-255, ensure that arguments are
> registers. [PR107988]
> 
> Richard Earnshaw  writes:
> > On 08/12/2022 16:39, Tamar Christina via Gcc-patches wrote:
> >> Hi All,
> >>
> >> At -O0 (as opposed to e.g. volatile) we can get into the situation
> >> where the
> >> in0 and result RTL arguments passed to the division function are
> >> memory locations instead of registers.  I think we could reject these
> >> early on by checking that the gimple values are GIMPLE registers, but
> >> I think it's better to handle it.
> >>
> >> As such I force them to registers and emit a move to the memory
> >> locations and leave it up to reload to handle.  This fixes the ICE
> >> and still allows the optimization in these cases, which improves the
> >> code quality a lot.
> >>
> >> Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
> >>
> >> Ok for master?
> >>
> >> Thanks,
> >> Tamar
> >>
> >>
> >>
> >> gcc/ChangeLog:
> >>
> >>PR target/107988
> >>* config/aarch64/aarch64.cc
> >>(aarch64_vectorize_can_special_div_by_constant): Ensure input and output
> >>RTL are registers.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >>PR target/107988
> >>* gcc.target/aarch64/pr107988-1.c: New test.
> >>
> >> --- inline copy of patch --
> >> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> >> index b8dc3f070c8afc47c85fa18768c4da92c774338f..9f96424993c4fe90e1b241fcb3aa97025225 100644
> >> --- a/gcc/config/aarch64/aarch64.cc
> >> +++ b/gcc/config/aarch64/aarch64.cc
> >> @@ -24337,12 +24337,27 @@ aarch64_vectorize_can_special_div_by_constant (enum tree_code code,
> >> if (!VECTOR_TYPE_P (vectype))
> >>  return false;
> >>
> >> +  if (!REG_P (in0))
> >> +in0 = force_reg (GET_MODE (in0), in0);
> >> +
> >> gcc_assert (output);
> >>
> >> -  if (!*output)
> >> -*output = gen_reg_rtx (TYPE_MODE (vectype));
> >> +  rtx res =  NULL_RTX;
> >> +
> >> +  /* Once we get to this point we cannot reject the RTL,  if it's not a reg then
> >> + Create a new reg and write the result to the output afterwards.  */
> >> +  if (!*output || !REG_P (*output))
> >> +    res = gen_reg_rtx (TYPE_MODE (vectype));
> >> +  else
> >> +    res = *output;
> >
> > Why not write
> >rtx res = *output;
> >if (!res || !REG_P (res))
> >  res = gen_reg_rtx...
> >
> > then you don't need either the else clause or the dead NULL_RTX
> > assignment.
> 
> I'd prefer that we use the expand_insn interface, which already has logic for
> coercing inputs and outputs to predicates.  Something like:
> 
>   machine_mode mode = TYPE_MODE (vectype);
>   unsigned int flags = aarch64_classify_vector_mode (mode);
>   if ((flags & VEC_ANY_SVE) && !TARGET_SVE2)
> return false;
> 
>   ...
> 
>   expand_operand ops[3];
>   create_output_operand (&ops[0], *output, mode);
>   create_input_operand (&ops[1], in0, mode);
>   create_fixed_operand (&ops[2], in1);
>   expand_insn (insn_code, 3, ops);
>   *output = ops[0].value;
>   return true;
> 
> On this function: why do we have the VECTOR_TYPE_P condition in:
> 

It was left over after checking for optabs support. It's superfluous now.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

PR target/107988
* config/aarch64/aarch64.cc
(aarch64_vectorize_can_special_div_by_constant): Ensure input and output
RTL are registers.

gcc/testsuite/ChangeLog:

PR target/107988
* gcc.target/aarch64/pr107988-1.c: New test.

--- inline copy of patch ---

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 7bb0b7602ff6474410059494dd86b7be1621dde5..e1f34ef5da170ef11727e0c99a5bd42849c5d185 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -24395,7 +24395,8 @@ aarch64_vectorize_can_special_div_by_constant (enum tree_code code,
   || !TYPE_UNSIGNED (vectype))
 return false;
 
-  unsigned int flags = aarch64_classify_vector_mode (TYPE_MODE (vectype));
+  machine_mode mode = TYPE_MODE (vectype);
+  unsigned int flags = aarch64_classify_vector_mode (mode);
   if ((flags & VEC_ANY_SVE) && !TARGET_SVE2)
 return false;
 
@@ -24411,15 +24412,14 @@ aarch64_vectorize_can_special_div_by_constant (enum tree_code code,
   if (in0 == NULL_RTX && in1 == NULL_RTX)
 return true;
 
-  if (!VECTOR_TYPE_P (vectype))
-   return false;
-
   gcc_assert (output);
 
-  if (!*output)
-*output = gen_reg_rtx (TYPE_MODE (vectype));
-
-  emit_insn (gen_aarch64_bitmask_udiv3 (TYPE_MODE (vectype), *output, in0, in1));
+  expand_operand ops[3];
+  create_output_operand (&ops[0], *output, mode);
+  create_input_operand (&ops[1], in0, mode);
+  crea

[PATCH] rs6000: Raise error for __vector_{quad, pair} uses without MMA enabled [PR106736]

2022-12-14 Thread Kewen.Lin via Gcc-patches
Hi,

As PR106736 shows, using the __vector_quad and __vector_pair
types without MMA support is unexpected and causes an ICE
when expanding the corresponding assignment.  We can't guard
the registration of these built-in types under MMA support, as
Peter pointed out in that PR, because the registration is global
and so would not work for target pragma/attribute support with
MMA enabled.  The existing verify_type_context mentioned in [2]
can help to improve the diagnostics for invalid built-in type
uses, but as Richard pointed out in [4], it can't deal with
all cases.  Following the discussions in [1][3], this patch
checks for invalid uses of the built-in types __vector_quad and
__vector_pair in the mov patterns of OOmode and XOmode, on the
gimple assignment statement currently being expanded.  It
still puts an assertion in the else arm rather than just letting
it go through, to ensure that we catch any other unexpected
cases promptly if there are any.

Bootstrapped and regtested on powerpc64-linux-gnu P8,
powerpc64le-linux-gnu P9 and P10.

I'm going to push this next week if no objections.

[1] https://gcc.gnu.org/pipermail/gcc/2022-December/240218.html
[2] https://gcc.gnu.org/pipermail/gcc/2022-December/240220.html
[3] https://gcc.gnu.org/pipermail/gcc/2022-December/240223.html
[4] https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608083.html

BR,
Kewen
-
PR target/106736

gcc/ChangeLog:

* config/rs6000/mma.md (define_expand movoo): Call function
rs6000_opaque_type_invalid_use_p to check and emit error message for
the invalid use of opaque type.
(define_expand movxo): Likewise.
* config/rs6000/rs6000-protos.h
(rs6000_opaque_type_invalid_use_p): New function declaration.
(currently_expanding_gimple_stmt): New extern declaration.
* config/rs6000/rs6000.cc (rs6000_opaque_type_invalid_use_p): New
function.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr106736-1.c: New test.
* gcc.target/powerpc/pr106736-2.c: Likewise.
* gcc.target/powerpc/pr106736-3.c: Likewise.
* gcc.target/powerpc/pr106736-4.c: Likewise.
* gcc.target/powerpc/pr106736-5.c: Likewise.
---
 gcc/config/rs6000/mma.md  | 10 -
 gcc/config/rs6000/rs6000-protos.h |  2 +
 gcc/config/rs6000/rs6000.cc   | 39 ++-
 gcc/testsuite/gcc.target/powerpc/pr106736-1.c | 20 ++
 gcc/testsuite/gcc.target/powerpc/pr106736-2.c | 17 
 gcc/testsuite/gcc.target/powerpc/pr106736-3.c | 18 +
 gcc/testsuite/gcc.target/powerpc/pr106736-4.c | 19 +
 gcc/testsuite/gcc.target/powerpc/pr106736-5.c | 18 +
 8 files changed, 140 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr106736-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr106736-2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr106736-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr106736-4.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr106736-5.c

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index 032f4263cb0..f2952a3c3be 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -285,8 +285,11 @@ (define_expand "movoo"
 expanding to RTL and have seen errors.  It would not cause further ICEs
 as the compilation would stop soon after expanding.  */
 }
+  else if (rs6000_opaque_type_invalid_use_p (currently_expanding_gimple_stmt))
+;
   else
-gcc_unreachable ();
+/* Catch unexpected cases.  */
+gcc_assert (false);
 })

 (define_insn_and_split "*movoo"
@@ -329,8 +332,11 @@ (define_expand "movxo"
 some missing required conditions.  So do the same handlings for XOmode
 as OOmode here.  */
 }
+  else if (rs6000_opaque_type_invalid_use_p (currently_expanding_gimple_stmt))
+;
   else
-gcc_unreachable ();
+/* Catch unexpected cases.  */
+gcc_assert (false);
 })

 (define_insn_and_split "*movxo"
diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h
index d0d89320ef6..d6cf6f87f54 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -344,4 +344,6 @@ extern rtx rs6000_gen_lvx (enum machine_mode, rtx, rtx);
 extern rtx rs6000_gen_stvx (enum machine_mode, rtx, rtx);

 extern void rs6000_emit_xxspltidp_v2df (rtx, long value);
+extern gimple *currently_expanding_gimple_stmt;
+extern bool rs6000_opaque_type_invalid_use_p (gimple *);
 #endif  /* rs6000-protos.h */
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index eb7ad5e954f..70e93c54483 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -28834,7 +28834,44 @@ constant_generates_xxspltidp (vec_const_128bit_type *vsx_const)
   return sf_value;
 }

-

+/* Now we have only two opaque types, they are __vector_quad and
+   __vector_pair built-in types.  They are target specific and
+   o

PING^1 [PATCH 0/9] rs6000: Rework rs6000_emit_vector_compare

2022-12-14 Thread Kewen.Lin via Gcc-patches
Hi,

Gentle ping this series:

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607146.html

BR,
Kewen

on 2022/11/24 17:15, Kewen Lin wrote:
> Hi,
> 
> Following Segher's suggestion, this patch series reworks
> function rs6000_emit_vector_compare for vector float and int
> in multiple steps; it's based on the previous attempts [1][2].
> As mentioned in [1], the reason to rework this for float is to
> have one centralized place for vector float comparison handling
> instead of scattering the support for swapping operands,
> reversing the code, etc.  It also prepares for a subsequent patch
> to handle comparison operators with or without trapping math
> (PR105480).  With the handling of vector float reworked, we can
> further simplify the handling of vector int as shown.
> 
> For Segher's concern about whether this rework causes any
> assembly change, I previously constructed two testcases, for
> vector float [3] and int [4] respectively.  They showed that most
> cases are unchanged except for LE and UNGT, where the difference
> is an improvement since it uses GE instead of GT ior EQ.  The
> associated test case in patch 3/9 is a good example.
> 
> Besides, w/ and w/o the whole patch series, I built the whole
> SPEC2017 at options -O3 and -Ofast separately, checked the
> differences on object assembly.  The result showed that the
> most are unchanged, except for:
> 
>   * at -O3, 521.wrf_r has 9 object files and 526.blender_r has
> 9 object files with differences.
> 
>   * at -Ofast, 521.wrf_r has 12 object files, 526.blender_r has
> one and 527.cam4_r has 4 object files with differences.
> 
> Looking into these differences, all significant ones
> are caused by the known improvement mentioned above, transforming
> GT ior EQ to GE, which can also affect unrolling decisions due
> to insn count.  Some other trivial differences are branch
> target offset differences, nop differences for alignment, vsx
> register number differences etc.
> 
> I also evaluated the runtime performance for these changed
> benchmarks, the result is neutral.
> 
> These patches are bootstrapped and regress-tested
> incrementally on powerpc64-linux-gnu P7 & P8, and
> powerpc64le-linux-gnu P9 & P10.
> 
> Is it ok for trunk?
> 
> BR,
> Kewen
> -
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606375.html
> [2] https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606376.html
> [3] https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606504.html
> [4] https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606506.html
> 
> Kewen Lin (9):
>   rs6000: Rework vector float comparison in rs6000_emit_vector_compare - p1
>   rs6000: Rework vector float comparison in rs6000_emit_vector_compare - p2
>   rs6000: Rework vector float comparison in rs6000_emit_vector_compare - p3
>   rs6000: Rework vector float comparison in rs6000_emit_vector_compare - p4
>   rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p1
>   rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p2
>   rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p3
>   rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p4
>   rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p5
> 
>  gcc/config/rs6000/rs6000.cc | 180 ++--
>  gcc/testsuite/gcc.target/powerpc/vcond-fp.c |  25 +++
>  2 files changed, 74 insertions(+), 131 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/vcond-fp.c
> 
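
As a scalar illustration of the LE change mentioned in the quoted cover
letter (a model only, not the rs6000 code; the helper names are made up):
a <= b can be computed as a single GE with the operands swapped rather
than as GT ior EQ.  The sketch checks that the two forms agree on a small
set of values, including NaN, for which every ordered comparison is false.

```cpp
#include <limits>

// Two-compare form: LE built as GT ior EQ on swapped operands.
bool le_via_gt_ior_eq (double a, double b) { return (b > a) || (b == a); }

// One-compare form: a single GE on swapped operands.
bool le_via_ge (double a, double b) { return b >= a; }

// Check that both forms agree with a <= b on a small set of values.
bool le_forms_agree ()
{
  const double nan = std::numeric_limits<double>::quiet_NaN ();
  const double inf = std::numeric_limits<double>::infinity ();
  const double vals[] = { -inf, -1.0, -0.0, 0.0, 1.0, inf, nan };
  for (double a : vals)
    for (double b : vals)
      if (le_via_gt_ior_eq (a, b) != (a <= b)
          || le_via_ge (a, b) != (a <= b))
        return false;
  return true;
}
```

The one-compare form needs one instruction fewer per comparison, which is
consistent with the insn count effect on unrolling described above.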


PING^2 [PATCH v2] rs6000: Rework option -mpowerpc64 handling [PR106680]

2022-12-14 Thread Kewen.Lin via Gcc-patches
Hi,

Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603350.html

BR,
Kewen

> on 2022/10/12 16:12, Kewen.Lin via Gcc-patches wrote:
>> Hi,
>>
>> PR106680 shows that -m32 -mpowerpc64 is different from
>> -mpowerpc64 -m32; this is determined by the way we
>> handle option powerpc64 in rs6000_handle_option.
>>
>> Segher pointed out this difference should be taken as
>> a bug and we should ensure that option powerpc64 is
>> independent of -m32/-m64.  So this patch removes the
>> handling in rs6000_handle_option and adds the necessary
>> support in rs6000_option_override_internal instead.
>>
>> With this patch, if users specify -m{no-,}powerpc64, the
>> specified value is honoured; otherwise, for 64bit it
>> always enables OPTION_MASK_POWERPC64, while for 32bit
>> with TARGET_POWERPC64 and OS_MISSING_POWERPC64, it disables
>> OPTION_MASK_POWERPC64.
>>
>> btw, following Segher's suggestion, I tried warning
>> when OPTION_MASK_POWERPC64 is set for OS_MISSING_POWERPC64.
>> If we warn when powerpc64 is specified explicitly,
>> some TCs using -m32 -mpowerpc64 on ppc64-linux
>> need updates, and the artificial run
>> with "--target_board=unix'{-m32/-mpowerpc64}'" produces
>> noisy warnings on ppc64-linux.  If we warn when
>> it's specified implicitly, it can just be initialized by
>> TARGET_DEFAULT (like -m32 on ppc64-linux) or set from the
>> given cpu mask, so we would have to special-case those and not warn.
>> Following Segher's latest comment, I decided not to warn and
>> to keep the behaviour consistent with before.
>>
>> Bootstrapped and regress-tested on:
>>   - powerpc64-linux-gnu P7 and P8 {-m64,-m32}
>>   - powerpc64le-linux-gnu P9 and P10
>>   - powerpc-ibm-aix7.2.0.0 {-maix64,-maix32}
>>
>> Hi Iain, could you help to test this new patch on darwin
>> again?  Thanks in advance!
>>
>> Is it ok for trunk if darwin testing goes well?
>>
>
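
The rule described in the quoted message can be modelled roughly as below.
This is a sketch under assumptions, not the rs6000 implementation: the
struct, its field names and resolve_powerpc64 are hypothetical, and only
the three cases spelled out in the message are encoded.  The point is that
the result is a function of the final set of options, so -m32 -mpowerpc64
and -mpowerpc64 -m32 cannot differ.

```cpp
// Hypothetical model of the described rule; all names are illustrative.
struct Powerpc64Inputs
{
  bool explicit_set;         // user passed -mpowerpc64 or -mno-powerpc64
  bool explicit_value;       // the requested value, if explicit_set
  bool is_64bit;             // true for -m64, false for -m32
  bool default_value;        // value from TARGET_DEFAULT or the cpu mask
  bool os_missing_powerpc64; // models OS_MISSING_POWERPC64
};

bool resolve_powerpc64 (const Powerpc64Inputs &in)
{
  // An explicit -m{no-,}powerpc64 is always honoured.
  if (in.explicit_set)
    return in.explicit_value;
  // 64-bit always enables the flag.
  if (in.is_64bit)
    return true;
  // 32-bit with the flag set implicitly on an OS lacking support: disable.
  if (in.default_value && in.os_missing_powerpc64)
    return false;
  // Otherwise keep whatever the defaults gave us.
  return in.default_value;
}
```

Resolving once after all options are parsed, instead of reacting to each
option as it is seen, is what makes the outcome order-independent.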


PING^1 [PATCH] rs6000: Fix some issues related to Power10 fusion [PR104024]

2022-12-14 Thread Kewen.Lin via Gcc-patches
Hi,

Gentle ping: https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607526.html

BR,
Kewen

on 2022/11/30 16:30, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> As PR104024 shows, the option -mpower10-fusion isn't guarded by
> -mcpu=power10, so it causes the compiler to fuse some patterns
> even without power10 support, which then causes an unexpected ICE.
> This patch simply unmasks it without power10 support and does not
> emit any warnings as this option is undocumented.
> 
> Besides, some define_insns in fusion.md which use constraint
> v require the condition VECTOR_UNIT_ALTIVEC_OR_VSX_P
> (<MODE>mode), otherwise they can cause an ICE in reload, see test
> case pr104024-2.c.
> 
> Bootstrapped and regtested on powerpc64-linux-gnu P8,
> powerpc64le-linux-gnu P9 and P10.
> 
> Is it ok for trunk?
> 
> BR,
> Kewen
> -
>   PR target/104024
> 
> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000.cc (rs6000_option_override_internal): Disable
>   TARGET_P10_FUSION if !TARGET_POWER10.
>   * config/rs6000/fusion.md: Regenerate.
>   * config/rs6000/genfusion.pl: Add the check for define_insns
>   with constraint v.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/powerpc/pr104024-1.c: New test.
>   * gcc.target/powerpc/pr104024-2.c: New test.
> ---
>  gcc/config/rs6000/fusion.md   | 130 +-
>  gcc/config/rs6000/genfusion.pl|  12 +-
>  gcc/config/rs6000/rs6000.cc   |  11 +-
>  gcc/testsuite/gcc.target/powerpc/pr104024-1.c |  16 +++
>  gcc/testsuite/gcc.target/powerpc/pr104024-2.c |  18 +++
>  5 files changed, 113 insertions(+), 74 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr104024-1.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr104024-2.c
> 
> diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
> index 15f0c16f705..c504f65a045 100644
> --- a/gcc/config/rs6000/fusion.md
> +++ b/gcc/config/rs6000/fusion.md
> @@ -1875,7 +1875,7 @@ (define_insn "*fuse_vand_vand"
>(match_operand:VM 1 "altivec_register_operand" "%v,v,v,v"))
>   (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
> (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
> -  "(TARGET_P10_FUSION)"
> +  "(VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode) && TARGET_P10_FUSION)"
>"@
> vand %3,%1,%0\;vand %3,%3,%2
> vand %3,%1,%0\;vand %3,%3,%2
> @@ -1893,7 +1893,7 @@ (define_insn "*fuse_vandc_vand"
>(match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
>   (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
> (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
> -  "(TARGET_P10_FUSION)"
> +  "(VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode) && TARGET_P10_FUSION)"
>"@
> vandc %3,%1,%0\;vand %3,%3,%2
> vandc %3,%1,%0\;vand %3,%3,%2
> @@ -1911,7 +1911,7 @@ (define_insn "*fuse_veqv_vand"
>(match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
>   (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
> (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
> -  "(TARGET_P10_FUSION)"
> +  "(VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode) && TARGET_P10_FUSION)"
>"@
> veqv %3,%1,%0\;vand %3,%3,%2
> veqv %3,%1,%0\;vand %3,%3,%2
> @@ -1929,7 +1929,7 @@ (define_insn "*fuse_vnand_vand"
>(not:VM (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
>   (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
> (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
> -  "(TARGET_P10_FUSION)"
> +  "(VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode) && TARGET_P10_FUSION)"
>"@
> vnand %3,%1,%0\;vand %3,%3,%2
> vnand %3,%1,%0\;vand %3,%3,%2
> @@ -1947,7 +1947,7 @@ (define_insn "*fuse_vnor_vand"
>(not:VM (match_operand:VM 1 "altivec_register_operand" "v,v,v,v")))
>   (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
> (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
> -  "(TARGET_P10_FUSION)"
> +  "(VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode) && TARGET_P10_FUSION)"
>"@
> vnor %3,%1,%0\;vand %3,%3,%2
> vnor %3,%1,%0\;vand %3,%3,%2
> @@ -1965,7 +1965,7 @@ (define_insn "*fuse_vor_vand"
>(match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
>   (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
> (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
> -  "(TARGET_P10_FUSION)"
> +  "(VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode) && TARGET_P10_FUSION)"
>"@
> vor %3,%1,%0\;vand %3,%3,%2
> vor %3,%1,%0\;vand %3,%3,%2
> @@ -1983,7 +1983,7 @@ (define_insn "*fuse_vorc_vand"
>(match_operand:VM 1 "altivec_register_operand" "v,v,v,v"))
>   (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
> (clobber (match_scratch:VM 4 "=X,X,X,&v"))]
> -  "(TARGET_P10_FUSION)"
> +  "(VECTOR_UNIT_AL

PING^1 [PATCH v2] predict: Adjust optimize_function_for_size_p [PR105818]

2022-12-14 Thread Kewen.Lin via Gcc-patches
Hi,

Gentle ping: https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607527.html

BR,
Kewen

on 2022/11/30 16:30, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> Function optimize_function_for_size_p returns OPTIMIZE_SIZE_NO
> if fun->decl is not null but no cgraph node is available for it.
> As PR105818 shows, this can have unexpected consequences.  For
> the case in PR105818, when parsing the bar decl in function foo,
> cfun is the function structure for foo, for which there is no
> cgraph node, so it returns OPTIMIZE_SIZE_NO.  But that's incorrect
> since the context is to optimize for size: the flag optimize_size
> is true.
> 
> The patch makes optimize_function_for_size_p additionally check
> opt_for_fn (fun->decl, optimize_size) when fun->decl
> is available but there is no cgraph node, just like what function
> cgraph_node::optimize_for_size_p does at its first step.
> 
> One regression failure got exposed on aarch64-linux-gnu:
> 
> PASS->FAIL: gcc.dg/guality/pr54693-2.c   -Os \
>   -DPREVENT_OPTIMIZATION  line 21 x == 10 - i
> 
> The difference comes from the macro LOGICAL_OP_NON_SHORT_CIRCUIT
> used in function fold_range_test during C parsing; it uses
> optimize_function_for_speed_p, which is the inverse
> of optimize_function_for_size_p.  At that time cfun->decl is valid
> but there is no cgraph node for it; without this patch, function
> optimize_function_for_speed_p eventually returns true, while it
> returns false with this patch.  Since the command line option -Os
> is specified, there is no reason to interpret it as "for speed".
> I think this failure is expected, so I adjusted the test case
> accordingly.
> 
> v1: https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596628.html
> 
> Compared with v1, v2 adopts opt_for_fn (fun->decl, optimize_size)
> instead of optimize_size, as suggested in Honza's previous comments.
> 
> Besides, the reply to Honza's question "Why exactly PR105818 hits
> the flag change issue?" was at the link:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596667.html
> 
> Bootstrapped and regtested on x86_64-redhat-linux,
> aarch64-linux-gnu and powerpc64{,le}-linux-gnu.
> 
> Is it ok for trunk?
> 
> BR,
> Kewen
> -
>   PR middle-end/105818
> 
> gcc/ChangeLog:
> 
>   * predict.cc (optimize_function_for_size_p): Further check
>   optimize_size of fun->decl when it is valid but no cgraph node.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/powerpc/pr105818.c: New test.
>   * gcc.dg/guality/pr54693-2.c: Adjust for aarch64.
> ---
>  gcc/predict.cc  |  3 ++-
>  gcc/testsuite/gcc.dg/guality/pr54693-2.c|  2 +-
>  gcc/testsuite/gcc.target/powerpc/pr105818.c | 11 +++
>  3 files changed, 14 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr105818.c
> 
> diff --git a/gcc/predict.cc b/gcc/predict.cc
> index 1bc7ab94454..ecb4aabc9df 100644
> --- a/gcc/predict.cc
> +++ b/gcc/predict.cc
> @@ -268,7 +268,8 @@ optimize_function_for_size_p (struct function *fun)
>cgraph_node *n = cgraph_node::get (fun->decl);
>if (n)
>  return n->optimize_for_size_p ();
> -  return OPTIMIZE_SIZE_NO;
> +  return opt_for_fn (fun->decl, optimize_size) ? OPTIMIZE_SIZE_MAX
> +: OPTIMIZE_SIZE_NO;
>  }
> 
>  /* Return true if function FUN should always be optimized for speed.  */
> diff --git a/gcc/testsuite/gcc.dg/guality/pr54693-2.c b/gcc/testsuite/gcc.dg/guality/pr54693-2.c
> index 68aa6c63d71..14ca94ad37d 100644
> --- a/gcc/testsuite/gcc.dg/guality/pr54693-2.c
> +++ b/gcc/testsuite/gcc.dg/guality/pr54693-2.c
> @@ -17,7 +17,7 @@ foo (int x, int y, int z)
>int i = 0;
>while (x > 3 && y > 3 && z > 3)
>  {/* { dg-final { gdb-test .+2 "i" "v + 1" } } */
> - /* { dg-final { gdb-test .+1 "x" "10 - i" } } */
> + /* { dg-final { gdb-test .+1 "x" "10 - i" { xfail { aarch64*-*-* && { any-opts "-Os" } } } } } */
>bar (i);   /* { dg-final { gdb-test . "y" "20 - 2 * i" } } */
>   /* { dg-final { gdb-test .-1 "z" "30 - 3 * i" { xfail { aarch64*-*-* && { any-opts "-fno-fat-lto-objects" "-Os" } } } } } */
>i++, x--, y -= 2, z -= 3;
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr105818.c b/gcc/testsuite/gcc.target/powerpc/pr105818.c
> new file mode 100644
> index 000..679647e189d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr105818.c
> @@ -0,0 +1,11 @@
> +/* { dg-options "-Os -fno-tree-vectorize" } */
> +
> +/* Verify there is no ICE.  */
> +
> +#pragma GCC optimize "-fno-tree-vectorize"
> +
> +void
> +foo (void)
> +{
> +  void bar (void);
> +}
> --
> 2.27.0
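
To make the behaviour change in the predict.cc hunk concrete, here is a
small self-contained model of the fallback.  It is a sketch with mock
types, not GCC code: SizeLevel, MockNode and MockFunction are hypothetical,
and decl_optimize_size stands in for opt_for_fn (fun->decl, optimize_size).

```cpp
// Mock types; names are illustrative and not taken from GCC.
enum class SizeLevel { No, Balanced, Max };

struct MockNode { SizeLevel level; };

struct MockFunction
{
  bool decl_optimize_size; // models opt_for_fn (fun->decl, optimize_size)
  const MockNode *node;    // models cgraph_node::get (fun->decl); may be null
};

// Before the patch: no cgraph node means "not optimizing for size".
SizeLevel size_level_before (const MockFunction &fun)
{
  if (fun.node)
    return fun.node->level;
  return SizeLevel::No;
}

// After the patch: without a node, fall back to the decl's own option.
SizeLevel size_level_after (const MockFunction &fun)
{
  if (fun.node)
    return fun.node->level;
  return fun.decl_optimize_size ? SizeLevel::Max : SizeLevel::No;
}
```

With -Os and no cgraph node yet (the situation while parsing a nested
declaration), the first form answers No and the second answers Max; when a
node exists the two are identical.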


[PATCH] RISC-V: Remove unit-stride store from ta attribute

2022-12-14 Thread juzhe . zhong
From: Ju-Zhe Zhong 

Since store instructions don't care about the tail policy, we remove
vste from the "ta" attribute. Hence, we get more fusion chances
and better optimization.

gcc/ChangeLog:

* config/riscv/vector.md: Remove vste.

---
 gcc/config/riscv/vector.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 7dfadaa96b6..84adbb9974a 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -221,7 +221,7 @@
 
 ;; The tail policy op value.
 (define_attr "ta" ""
-  (cond [(eq_attr "type" "vlde,vste,vimov,vfmov,vlds")
+  (cond [(eq_attr "type" "vlde,vimov,vfmov,vlds")
   (symbol_ref "riscv_vector::get_ta(operands[5])")]
(const_int INVALID_ATTRIBUTE)))
 
-- 
2.36.3



Re: [PATCH 6/9] ipa-sra: Be optimistic about Fortran descriptors

2022-12-14 Thread Martin Jambor
Hi,

On Mon, Dec 12 2022, Jan Hubicka wrote:
>> Hi,
>> 
>> I'm re-posting patches which I have posted at the end of stage 1 but
>> which have not passed review yet.
>> 
>> 8<
>> 
>> Fortran descriptors are structures which are often constructed just
>> for a particular argument of a particular call where it is passed by
>> reference.  When the called function is under compiler's control, it
>> can be beneficial to split up the descriptor and pass it in individual
>> parameters.  Unfortunately, currently we allow IPA-SRA to replace a
>> pointer with a set of replacements which are at most twice as big in
>> total and for descriptors we'd need to bump that factor to seven.
>> 
>> This patch looks for parameters which are ADDR_EXPRs of local
>> variables which are written to and passed as arguments by reference
>> but are never loaded from and marks them with a flag in the call edge
>> summary.  The IPA analysis phase then identifies formal parameters
>> which are always fed such arguments and then is more lenient when it
>> comes to size.
>> 
>> In order not to store two maximums per formal parameter, I calculate
>> the more lenient one by multiplying the existing one with a new
>> parameter.  If it is preferable to keep the maximums independent, we
>> can do that.  Documentation for the new parameter is missing as I
>> still need to re-base the patch on a version which has sphinx.  I will
>> write it before committing.
>> 
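The size-limit calculation described in the quoted paragraphs can be
sketched as follows.  This is an illustration only: the function and its
parameter names are hypothetical, and the growth factor used in the usage
note is an assumed value chosen so that the combined factor reaches the
"seven" mentioned for descriptors, not a value taken from the patch.

```cpp
// Hypothetical helper; names and example values are illustrative.
unsigned
max_replacement_size (unsigned pointer_size, unsigned base_factor,
                      unsigned ptrwrap_growth_factor,
                      bool always_specially_constructed)
{
  // Existing limit: replacements may be at most base_factor times as big
  // as the pointer they replace.
  unsigned limit = pointer_size * base_factor;
  // More lenient limit: multiply the existing one by the new parameter,
  // so only one maximum needs to be stored per formal parameter.
  if (always_specially_constructed)
    limit *= ptrwrap_growth_factor;
  return limit;
}
```

For example, with an 8-byte pointer, the existing factor of two and an
assumed growth factor of four, an ordinary parameter is limited to 16
bytes of replacements while a specially constructed one gets 64.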
>> I have disabled IPA-SRA in pr48636-2.f90 in order to be able to keep
>> using its dump-scan expressions.  The new testcase is basically a copy
>> of it with different options and IPA-SRA dump scans.
>> 
>> Bootstrapped and tested individually when I originally posted it and
>> now bootstrapped and LTO-bootstrapped and tested as part of the whole
>> series.  OK for master?
>> 
>> 
>> gcc/ChangeLog:
>> 
>> 2022-11-11  Martin Jambor  
>> 
>>  * ipa-sra.c (isra_param_desc): New field not_specially_constructed.
>>  (struct isra_param_flow): New field constructed_for_calls.
>>  (isra_call_summary::dump): Dump the new flag.
>>  (loaded_decls): New variable.
>>  (dump_isra_param_descriptor): New parameter hints, dump
>>  not_specially_constructed if it is true.
>>  (dump_isra_param_descriptors): New parameter hints, pass it to
>>  dump_isra_param_descriptor.
>>  (ipa_sra_function_summaries::duplicate): Duplicate new flag.
>>  (create_parameter_descriptors): Adjust comment.
>>  (get_gensum_param_desc): Bail out when decl2desc is NULL.
>>  (scan_expr_access): Add loaded local variables to loaded_decls.
>>  (scan_function): Survive if final_bbs is NULL.
>>  (isra_analyze_call): Compute constructed_for_calls flag.
>>  (process_scan_results): Be optimistic about size limits.  Do not dump
>>  computed param hints when dumping IPA-SRA structures.
>>  (isra_write_edge_summary): Stream constructed_for_calls.
>>  (isra_read_edge_summary): Likewise.
>>  (ipa_sra_dump_all_summaries): New parameter hints, pass it to
>>  dump_isra_param_descriptor.
>>  (flip_all_hints_pessimistic): New function.
>>  (flip_all_param_hints_pessimistic): Likewise.
>>  (propagate_param_hints): Likewise.
>>  (disable_unavailable_parameters): Renamed to
>>  adjust_parameter_descriptions.  Expand size limits for parameters
>>  which are specially constructed by all callers.  Check limits again.
>>  (ipa_sra_analysis): Pass required hints to ipa_sra_dump_all_summaries.
>>  Add hint propagation.
>>  (ipa_sra_summarize_function): Initialize and destroy loaded_decls,
>>  rearrange so that scan_function is called even when there are no
>>  candidates.
>>  * params.opt (ipa-sra-ptrwrap-growth-factor): New parameter.
>
> Hmm, this is a quite specific heuristic, but I do not have much better
> ideas, so it is OK :)
>

Yeah, it kind of is.  I was wondering whether it should really only
target Fortran array descriptors (and have them marked some way) but
eventually decided for this - but I do not expect the code to trigger
too much for non-Fortran code.
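
For illustration only, here is a minimal Python sketch of the size check
described in the quoted patch description.  The function and argument names
are hypothetical and are not taken from ipa-sra.cc.  The default factor of 2
follows the "at most twice as big" wording, and the extra factor of 4 is an
assumption based on the Init(4) default of ipa-sra-ptrwrap-growth-factor in
params.opt.

```python
# Hypothetical model of the IPA-SRA size limit; names and defaults are
# assumptions made for illustration, not the real GCC implementation.

def max_replacement_size(pointer_size, growth_factor=2, ptrwrap_factor=4,
                         constructed_for_calls=False):
    """Return the largest total size allowed for the replacement parameters."""
    limit = pointer_size * growth_factor
    if constructed_for_calls:
        # The argument is always a local aggregate that callers only write
        # to and pass by reference, so the limit is multiplied by the
        # additional factor instead of storing a second maximum.
        limit *= ptrwrap_factor
    return limit


def can_split(total_replacement_size, pointer_size, constructed_for_calls):
    """True if the replacements fit under the (possibly lenient) limit."""
    return total_replacement_size <= max_replacement_size(
        pointer_size, constructed_for_calls=constructed_for_calls)
```

With an 8-byte pointer, replacements seven times as big (56 bytes) are
rejected by the plain limit of 16 bytes but fit under the lenient limit of
64 bytes.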

> Can this be useful also for inlining?

IPA-SRA deallocates its summaries after the analysis phase, so we'd need
to postpone that until later.  Otherwise it should be quite directly
usable, perhaps after a check that the "hint propagation" bit of
IPA-SRA has been run, because the flag starts with the optimistic value.

Martin


[PATCH (pushed)] contrib: add copyright for my scripts

2022-12-14 Thread Martin Liška
Hi.

The Copyright year will be updated automatically with a next patch
I'm going to send.

Cheers,
Martin

contrib/ChangeLog:

* analyze_brprob.py: Add copyright header.
* analyze_brprob_spec.py: Likewise.
* check-params-in-docs.py: Likewise.
* check_GNU_style.py: Likewise.
* check_GNU_style_lib.py: Likewise.
* filter-clang-warnings.py: Likewise.
* gcc-changelog/git_check_commit.py: Likewise.
* gcc-changelog/git_commit.py: Likewise.
* gcc-changelog/git_email.py: Likewise.
* gcc-changelog/git_repository.py: Likewise.
* gcc-changelog/git_update_version.py: Likewise.
* gcc-changelog/test_email.py: Likewise.
* mark_spam.py: Likewise.
---
 contrib/analyze_brprob.py   | 2 ++
 contrib/analyze_brprob_spec.py  | 2 ++
 contrib/check-params-in-docs.py | 2 ++
 contrib/check_GNU_style.py  | 2 ++
 contrib/check_GNU_style_lib.py  | 2 ++
 contrib/filter-clang-warnings.py| 2 ++
 contrib/gcc-changelog/git_check_commit.py   | 2 ++
 contrib/gcc-changelog/git_commit.py | 2 ++
 contrib/gcc-changelog/git_email.py  | 2 ++
 contrib/gcc-changelog/git_repository.py | 2 ++
 contrib/gcc-changelog/git_update_version.py | 2 ++
 contrib/gcc-changelog/test_email.py | 2 ++
 contrib/mark_spam.py| 2 ++
 13 files changed, 26 insertions(+)

diff --git a/contrib/analyze_brprob.py b/contrib/analyze_brprob.py
index debc9a6421a..d5a8031e75c 100755
--- a/contrib/analyze_brprob.py
+++ b/contrib/analyze_brprob.py
@@ -1,4 +1,6 @@
 #!/usr/bin/env python3
+
+# Copyright (C) 2016 Free Software Foundation, Inc.
 #
 # Script to analyze results of our branch prediction heuristics
 #
diff --git a/contrib/analyze_brprob_spec.py b/contrib/analyze_brprob_spec.py
index c7a9ae07e16..8f7dcbaddb4 100755
--- a/contrib/analyze_brprob_spec.py
+++ b/contrib/analyze_brprob_spec.py
@@ -1,5 +1,7 @@
 #!/usr/bin/env python3
 
+# Copyright (C) 2016 Free Software Foundation, Inc.
+#
 # This file is part of GCC.
 #
 # GCC is free software; you can redistribute it and/or modify it under
diff --git a/contrib/check-params-in-docs.py b/contrib/check-params-in-docs.py
index d57055088b7..8f8f6654df3 100755
--- a/contrib/check-params-in-docs.py
+++ b/contrib/check-params-in-docs.py
@@ -1,4 +1,6 @@
 #!/usr/bin/env python3
+
+# Copyright (C) 2018 Free Software Foundation, Inc.
 #
 # Find missing and extra parameters in documentation compared to
 # output of: gcc --help=params.
diff --git a/contrib/check_GNU_style.py b/contrib/check_GNU_style.py
index 969534a3cc9..826d17abf08 100755
--- a/contrib/check_GNU_style.py
+++ b/contrib/check_GNU_style.py
@@ -1,4 +1,6 @@
 #!/usr/bin/env python3
+
+# Copyright (C) 2017 Free Software Foundation, Inc.
 #
 # Checks some of the GNU style formatting rules in a set of patches.
 # The script is a rewritten of the same bash script and should eventually
diff --git a/contrib/check_GNU_style_lib.py b/contrib/check_GNU_style_lib.py
index b3db4fbddc9..3d709d1eafa 100755
--- a/contrib/check_GNU_style_lib.py
+++ b/contrib/check_GNU_style_lib.py
@@ -1,4 +1,6 @@
 #!/usr/bin/env python3
+
+# Copyright (C) 2017 Free Software Foundation, Inc.
 #
 # Checks some of the GNU style formatting rules in a set of patches.
 # The script is a rewritten of the same bash script and should eventually
diff --git a/contrib/filter-clang-warnings.py b/contrib/filter-clang-warnings.py
index 3c68be028a8..c426bce5eb5 100755
--- a/contrib/filter-clang-warnings.py
+++ b/contrib/filter-clang-warnings.py
@@ -1,4 +1,6 @@
 #!/usr/bin/env python3
+
+# Copyright (C) 2018 Free Software Foundation, Inc.
 #
 # Script to analyze warnings produced by clang.
 #
diff --git a/contrib/gcc-changelog/git_check_commit.py b/contrib/gcc-changelog/git_check_commit.py
index d6aff3cef91..2e3e8cbeb77 100755
--- a/contrib/gcc-changelog/git_check_commit.py
+++ b/contrib/gcc-changelog/git_check_commit.py
@@ -1,4 +1,6 @@
 #!/usr/bin/env python3
+
+# Copyright (C) 2020 Free Software Foundation, Inc.
 #
 # This file is part of GCC.
 #
diff --git a/contrib/gcc-changelog/git_commit.py b/contrib/gcc-changelog/git_commit.py
index d90e6c19b76..66d68de03a5 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -1,4 +1,6 @@
 #!/usr/bin/env python3
+
+# Copyright (C) 2020 Free Software Foundation, Inc.
 #
 # This file is part of GCC.
 #
diff --git a/contrib/gcc-changelog/git_email.py b/contrib/gcc-changelog/git_email.py
index 2566d4149e7..ef50ebfb7fd 100755
--- a/contrib/gcc-changelog/git_email.py
+++ b/contrib/gcc-changelog/git_email.py
@@ -1,4 +1,6 @@
 #!/usr/bin/env python3
+
+# Copyright (C) 2020 Free Software Foundation, Inc.
 #
 # This file is part of GCC.
 #
diff --git a/contrib/gcc-changelog/git_repository.py b/contrib/gcc-changelog/git_repository.py
index 2d688826ff8..7c2dc218775 100755
--- a/contrib/gcc-changelog/git_repository.py
+++ b/co

[PATCH] contrib: add contrib to update-copyright.py script

2022-12-14 Thread Martin Liška
Hi.

I would like to automatically update copyright in the contrib folder.
The resulting copyright update can be seen in the attachment; it can be
applied at the beginning of the next year.

Thoughts?

Cheers,
Martin

contrib/ChangeLog:

* update-copyright.py: Add contrib folder.
---
 contrib/update-copyright.py | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/contrib/update-copyright.py b/contrib/update-copyright.py
index 76614ffc4ea..5c4fc512488 100755
--- a/contrib/update-copyright.py
+++ b/contrib/update-copyright.py
@@ -680,6 +680,18 @@ class LibStdCxxFilter (GenericFilter):
 return re.compile ('// \(C\) Copyright Jeremy Siek')
 return GenericFilter.get_line_filter (self, dir, filename)
 
+class ContribFilder(GenericFilter):
+def __init__ (self):
+GenericFilter.__init__ (self)
+
+self.skip_files |= set ([
+# A different copyrights.
+'unicode-license.txt',
+'Info.plist',
+# Contains CR (^M).
+'repro_fail',
+])
+
 class GCCCopyright (Copyright):
 def __init__ (self, errors):
 Copyright.__init__ (self, errors)
@@ -699,6 +711,7 @@ class GCCCopyright (Copyright):
 self.add_external_author ('Advanced Micro Devices Inc.')
 self.add_external_author ('Ami Tavory and Vladimir Dreizin, IBM-HRL.')
 self.add_external_author ('Cavium Networks.')
+self.add_external_author ('David Malcolm')
 self.add_external_author ('Faraday Technology Corp.')
 self.add_external_author ('Florida State University')
 self.add_external_author ('Gerard Jungman')
@@ -738,7 +751,7 @@ class GCCCmdLine (CmdLine):
 # boehm-gc is imported from upstream.
 self.add_dir ('c++tools')
 self.add_dir ('config', ConfigFilter())
-# contrib isn't really part of GCC.
+self.add_dir ('contrib', ContribFilder())
 self.add_dir ('fixincludes')
 self.add_dir ('gcc', GCCFilter())
 self.add_dir (os.path.join ('gcc', 'testsuite'), TestsuiteFilter())
-- 
2.38.1
From e3752f59397746b3d558e7f7438302dac67d57b5 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Wed, 14 Dec 2022 14:08:04 +0100
Subject: [PATCH] Update copyright years.

---
 contrib/analyze_brprob.py | 2 +-
 contrib/analyze_brprob_spec.py| 2 +-
 contrib/bench-stringop| 2 +-
 contrib/check-params-in-docs.py   | 2 +-
 contrib/check_GNU_style.py| 2 +-
 contrib/check_GNU_style.sh| 2 +-
 contrib/check_GNU_style_lib.py| 2 +-
 contrib/check_makefile_deps.sh| 2 +-
 contrib/check_warning_flags.sh| 2 +-
 contrib/clang-format  | 2 +-
 contrib/compare-all-tests | 2 +-
 contrib/compare-debug | 2 +-
 contrib/compare-lto   | 2 +-
 contrib/compareSumTests3  | 2 +-
 contrib/compare_two_ftime_report_sets | 2 +-
 contrib/dg-cmp-results.sh | 2 +-
 contrib/dg-extract-results.py | 2 +-
 contrib/dg-extract-results.sh | 2 +-
 contrib/dglib.pm  | 2 +-
 contrib/download_prerequisites| 2 +-
 contrib/filter-clang-warnings.py  | 2 +-
 contrib/gcc-changelog/git_check_commit.py | 2 +-
 contrib/gcc-changelog/git_commit.py   | 2 +-
 contrib/gcc-changelog/git_email.py| 2 +-
 contrib/gcc-changelog/git_repository.py   | 2 +-
 contrib/gcc-changelog/git_update_version.py   | 2 +-
 contrib/gcc-changelog/test_email.py   | 2 +-
 contrib/gcc_build | 2 +-
 contrib/gen_autofdo_event.py  | 2 +-
 contrib/git-backport.py   | 2 +-
 contrib/git-commit-mklog.py   | 2 +-
 contrib/git-fix-changelog.py  | 2 +-
 contrib/jit-coverage-report.py| 2 +-
 contrib/legacy/mklog  | 2 +-
 contrib/legacy/mklog.pl   | 2 +-
 contrib/mark_spam.py  | 2 +-
 contrib/mklog.py  | 2 +-
 contrib/patch_tester.sh   | 2 +-
 contrib/prepare-commit-msg| 2 +-
 contrib/prepare_patch.sh  | 2 +-
 contrib/reghunt/bin/gcc-build-full   

Re: [PATCH (pushed)] contrib: add copyright for my scripts

2022-12-14 Thread Jakub Jelinek via Gcc-patches
On Wed, Dec 14, 2022 at 02:07:38PM +0100, Martin Liška wrote:
> Hi.
> 
> The Copyright year will be updated automatically with a next patch
> I'm going to send.

Shouldn't that be added as 2016-2022 etc.?

Jakub



Re: [PATCH] contrib: add contrib to update-copyright.py script

2022-12-14 Thread Jakub Jelinek via Gcc-patches
On Wed, Dec 14, 2022 at 02:10:32PM +0100, Martin Liška wrote:
> +class ContribFilder(GenericFilter):

s/Filder/Filter/g ?

> +def __init__ (self):
> +GenericFilter.__init__ (self)
> +
> +self.skip_files |= set ([
> +# A different copyrights.
> +'unicode-license.txt',
> +'Info.plist',
> +# Contains CR (^M).
> +'repro_fail',
> +])
> +
>  class GCCCopyright (Copyright):
>  def __init__ (self, errors):
>  Copyright.__init__ (self, errors)
> @@ -699,6 +711,7 @@ class GCCCopyright (Copyright):
>  self.add_external_author ('Advanced Micro Devices Inc.')
>  self.add_external_author ('Ami Tavory and Vladimir Dreizin, 
> IBM-HRL.')
>  self.add_external_author ('Cavium Networks.')
> +self.add_external_author ('David Malcolm')
>  self.add_external_author ('Faraday Technology Corp.')
>  self.add_external_author ('Florida State University')
>  self.add_external_author ('Gerard Jungman')
> @@ -738,7 +751,7 @@ class GCCCmdLine (CmdLine):
>  # boehm-gc is imported from upstream.
>  self.add_dir ('c++tools')
>  self.add_dir ('config', ConfigFilter())
> -# contrib isn't really part of GCC.
> +self.add_dir ('contrib', ContribFilder())
>  self.add_dir ('fixincludes')
>  self.add_dir ('gcc', GCCFilter())
>  self.add_dir (os.path.join ('gcc', 'testsuite'), TestsuiteFilter())

Jakub



Re: [PATCH (pushed)] contrib: add copyright for my scripts

2022-12-14 Thread Martin Liška
On 12/14/22 14:11, Jakub Jelinek wrote:
> On Wed, Dec 14, 2022 at 02:07:38PM +0100, Martin Liška wrote:
>> Hi.
>>
>> The Copyright year will be updated automatically with a next patch
>> I'm going to send.
> 
> Shouldn't that be added as 2016-2022 etc.?

Let it be updated automatically, then we can check if it contains X-2023 ;)

Cheers,
Martin
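
As a rough illustration of the "X-2023" check mentioned above, a minimal
Python sketch of extending a copyright year into a range could look like
this.  It is not the code in contrib/update-copyright.py; the regular
expression and the helper name bump_copyright are assumptions made for this
example.

```python
import re

# Hypothetical helper, not taken from contrib/update-copyright.py.
# Matches "Copyright (C) YYYY" optionally followed by "-YYYY".
COPYRIGHT_RE = re.compile(r"(Copyright \(C\) )(\d{4})(?:-(\d{4}))?")


def bump_copyright(line, new_year):
    """Extend the year (or year range) in a copyright line up to new_year."""
    def repl(match):
        first = int(match.group(2))
        last = int(match.group(3) or match.group(2))
        last = max(last, new_year)
        if first == last:
            return f"{match.group(1)}{first}"
        return f"{match.group(1)}{first}-{last}"
    return COPYRIGHT_RE.sub(repl, line)
```

A header added with a single year, such as 2016, would then become
2016-2023 after the update, which is the form the check would look for.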

> 
>   Jakub
> 



Re: [PATCH 8/9] ipa-sra: Make scan_expr_access bail out on uninteresting expressions

2022-12-14 Thread Martin Jambor
Hi,

On Tue, Dec 13 2022, Richard Biener wrote:
> On Mon, 12 Dec 2022, Jan Hubicka wrote:
>
>> > > Hi,
>> > > 
>> > > I'm re-posting patches which I have posted at the end of stage 1 but
>> > > which have not passed review yet.
>> > > 
>> > > 8<
>> > > 
>> > > I have noticed that scan_expr_access passes all the expressions it
>> > > gets to get_ref_base_and_extent even when we are really only
>> > > interested in memory accesses.  So bail out when the expression is
>> > > something clearly uninteresting.
>> > > 
>> > > Bootstrapped and tested individually when I originally posted it and
>> > > now bootstrapped and LTO-bootstrapped and tested as part of the whole
>> > > series.  OK for master?
>> > > 
>> > > 
>> > > gcc/ChangeLog:
>> > > 
>> > > 2021-12-14  Martin Jambor  
>> > > 
>> > >  * ipa-sra.c (scan_expr_access): Bail out early if expr is something we
>> > >  clearly do not need to pass to get_ref_base_and_extent.
>> > > ---
>> > >  gcc/ipa-sra.cc | 5 +
>> > >  1 file changed, 5 insertions(+)
>> > > 
>> > > diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc
>> > > index 93fceeafc73..3646d71468c 100644
>> > > --- a/gcc/ipa-sra.cc
>> > > +++ b/gcc/ipa-sra.cc
>> > > @@ -1748,6 +1748,11 @@ scan_expr_access (tree expr, gimple *stmt, isra_scan_context ctx,
>> > >|| TREE_CODE (expr) == REALPART_EXPR)
>> > >  expr = TREE_OPERAND (expr, 0);
>> > >  
>> > > +  if (!handled_component_p (expr)
>> > > +  && !DECL_P (expr)
>> > > +  && TREE_CODE (expr) != MEM_REF)
>> > > +return;
>> > Is this needed because get_ref_base_and_extent crashes if given SSA_NAME
>> > or something else or is it just an optimization?
>> > Perhaps Richi will know if there is better test for this.
>
> Also the code preceding the above
>
>   if (TREE_CODE (expr) == BIT_FIELD_REF
>   || TREE_CODE (expr) == IMAGPART_EXPR
>   || TREE_CODE (expr) == REALPART_EXPR)
> expr = TREE_OPERAND (expr, 0); 
>
> but get_ref_base_and_extent shouldn't crash on anything here.  The 
> question is what you want 'expr' to be?  The comment of the function
> says CTX specifies that, but doesn't constrain the CALL case (does
> it have to be a memory argument)?
>
> With allowing handled_component_p but above not handling
> VIEW_CONVERT_EXPR you leave the possibility of VIEW_CONVERT_EXPR (d_1)
> slipping through.  Since the non-memory cases will have at most
> one wrapping handled_component get_ref_base_and_extent should be
> reasonably cheap, so maybe just cut off SSA_NAME, ADDR_EXPR and
> CONSTANT_CLASS_P at the start of the function?
>

The patch was intended just as a simple optimization in order not to run
get_ref_base_and_extent on stuff where one can see from the top-most
tree that the result won't be interesting.  Indeed it looks like
get_ref_base_and_extent does not really need this when run on non-loads.

I'll think about the function a bit more but it seems like the patch
just is not really necessary.

Thanks,

Martin



Re: PING^1 [PATCH v2] predict: Adjust optimize_function_for_size_p [PR105818]

2022-12-14 Thread Jan Hubicka via Gcc-patches
> > PR middle-end/105818
> > 
> > gcc/ChangeLog:
> > 
> > * predict.cc (optimize_function_for_size_p): Further check
> > optimize_size of fun->decl when it is valid but no cgraph node.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * gcc.target/powerpc/pr105818.c: New test.
> > * gcc.dg/guality/pr54693-2.c: Adjust for aarch64.
> > diff --git a/gcc/testsuite/gcc.target/powerpc/pr105818.c b/gcc/testsuite/gcc.target/powerpc/pr105818.c
> > new file mode 100644
> > index 000..679647e189d
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/pr105818.c
> > @@ -0,0 +1,11 @@
> > +/* { dg-options "-Os -fno-tree-vectorize" } */
> > +
> > +/* Verify there is no ICE.  */
> > +
> > +#pragma GCC optimize "-fno-tree-vectorize"
> > +
> > +void
> > +foo (void)
> > +{
> > +  void bar (void);
> > +}
So the testcase starts with optimize_size set but then it switches to
optimize_size==0 due to the GCC optimize pragma.  I think this is
behaviour Martin wants to change, so perhaps the testcase should be
written with explicit -O2.

I also wonder what happens when you add the attribute later?
/* { dg-options "-Os -fno-tree-vectorize" } */

/* Verify there is no ICE.  */

#pragma GCC optimize "-fno-tree-vectorize"

void
foo (void)
{
  void bar (void);
}

__attribute__ ((optimize("-fno-tree-vectorize"))) void foo (void);

I think we should generally avoid making decisions about size/speed
optimizations so early since the setting may change due to attributes or
profile feedback...

Honza


[PATCH] libgccjit: Fix a failing test

2022-12-14 Thread Guillaume Gomez via Gcc-patches
Hi,

This fixes bug 107999.

Thanks in advance for the review.
From e6db0cb107e54789095f4585dd279a2c984d2ca1 Mon Sep 17 00:00:00 2001
From: Guillaume Gomez 
Date: Wed, 14 Dec 2022 14:28:22 +0100
Subject: [PATCH] Fix a failing test by updating its error string.

gcc/testsuite/ChangeLog:

	* jit.dg/test-error-array-bounds.c: Update test.
---
 gcc/testsuite/jit.dg/test-error-array-bounds.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/jit.dg/test-error-array-bounds.c b/gcc/testsuite/jit.dg/test-error-array-bounds.c
index b6c0ee526d4..a0dead13cb7 100644
--- a/gcc/testsuite/jit.dg/test-error-array-bounds.c
+++ b/gcc/testsuite/jit.dg/test-error-array-bounds.c
@@ -70,5 +70,5 @@ verify_code (gcc_jit_context *ctxt, gcc_jit_result *result)
   /* ...and that the message was captured by the API.  */
   CHECK_STRING_VALUE (gcc_jit_context_get_first_error (ctxt),
 		  "array subscript 10 is above array bounds of"
-		  " 'char[10]' [-Warray-bounds]");
+		  " 'char[10]' [-Warray-bounds=]");
 }
-- 
2.34.1



Re: [PATCH]AArch64 div-by-255, ensure that arguments are registers. [PR107988]

2022-12-14 Thread Richard Sandiford via Gcc-patches
Tamar Christina  writes:
>> -Original Message-
>> From: Richard Sandiford 
>> Sent: Friday, December 9, 2022 7:08 AM
>> To: Richard Earnshaw 
>> Cc: Tamar Christina ; gcc-patches@gcc.gnu.org;
>> nd ; Richard Earnshaw ;
>> Marcus Shawcroft ; Kyrylo Tkachov
>> 
>> Subject: Re: [PATCH]AArch64 div-by-255, ensure that arguments are
>> registers. [PR107988]
>> 
>> Richard Earnshaw  writes:
>> > On 08/12/2022 16:39, Tamar Christina via Gcc-patches wrote:
>> >> Hi All,
>> >>
>> >> At -O0 (as opposed to e.g. volatile) we can get into the situation
>> >> where the
>> >> in0 and result RTL arguments passed to the division function are
>> >> memory locations instead of registers.  I think we could reject these
>> >> early on by checking that the gimple values are GIMPLE registers, but
>> >> I think it's better to handle it.
>> >>
>> >> As such I force them to registers and emit a move to the memory
>> >> locations and leave it up to reload to handle.  This fixes the ICE
>> >> and still allows the optimization in these cases,  which improves the code quality a lot.
>> >>
>> >> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>> >>
>> >> Ok for master?
>> >>
>> >> Thanks,
>> >> Tamar
>> >>
>> >>
>> >>
>> >> gcc/ChangeLog:
>> >>
>> >>   PR target/107988
>> >>   * config/aarch64/aarch64.cc
>> >>   (aarch64_vectorize_can_special_div_by_constant): Ensure input and output
>> >>   RTL are registers.
>> >>
>> >> gcc/testsuite/ChangeLog:
>> >>
>> >>   PR target/107988
>> >>   * gcc.target/aarch64/pr107988-1.c: New test.
>> >>
>> >> --- inline copy of patch --
>> >> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
>> >> index b8dc3f070c8afc47c85fa18768c4da92c774338f..9f96424993c4fe90e1b241fcb3aa97025225 100644
>> >> --- a/gcc/config/aarch64/aarch64.cc
>> >> +++ b/gcc/config/aarch64/aarch64.cc
>> >> @@ -24337,12 +24337,27 @@ aarch64_vectorize_can_special_div_by_constant (enum tree_code code,
>> >> if (!VECTOR_TYPE_P (vectype))
>> >>  return false;
>> >>
>> >> +  if (!REG_P (in0))
>> >> +in0 = force_reg (GET_MODE (in0), in0);
>> >> +
>> >> gcc_assert (output);
>> >>
>> >> -  if (!*output)
>> >> -*output = gen_reg_rtx (TYPE_MODE (vectype));
>> >> +  rtx res =  NULL_RTX;
>> >> +
>> >> +  /* Once e get to this point we cannot reject the RTL,  if it's not a reg then
>> >> + Create a new reg and write the result to the output afterwards.  */
>> >> +  if (!*output || !REG_P (*output))
>> >> +res = gen_reg_rtx (TYPE_MODE (vectype));
>> >> +  else
>> >> +res = *output;
>> >
>> > Why not write
>> >rtx res = *output;
>> >if (!res || !REG_P (res))
>> >  res = gen_reg_rtx...
>> >
>> > then you don't need either the else clause or the dead NULL_RTX
>> assignment.
>> 
>> I'd prefer that we use the expand_insn interface, which already has logic for
>> coercing inputs and outputs to predicates.  Something like:
>> 
>>   machine_mode mode = TYPE_MODE (vectype);
>>   unsigned int flags = aarch64_classify_vector_mode (mode);
>>   if ((flags & VEC_ANY_SVE) && !TARGET_SVE2)
>> return false;
>> 
>>   ...
>> 
>>   expand_operand ops[3];
>>   create_output_operand (&ops[0], *output, mode);
>>   create_input_operand (&ops[1], in0, mode);
>>   create_fixed_operand (&ops[2], in1);
>>   expand_insn (insn_code, 3, ops);
>>   *output = ops[0].value;
>>   return true;
>> 
>> On this function: why do we have the VECTOR_TYPE_P condition in:
>> 
>
> It was left over after checking for optabs support. It's superfluous now.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?

OK, thanks.

Richard

> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>   PR target/107988
>   * config/aarch64/aarch64.cc
>   (aarch64_vectorize_can_special_div_by_constant): Ensure input and output
>   RTL are registers.
>
> gcc/testsuite/ChangeLog:
>
>   PR target/107988
>   * gcc.target/aarch64/pr107988-1.c: New test.
>
> --- inline copy of patch ---
>
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 7bb0b7602ff6474410059494dd86b7be1621dde5..e1f34ef5da170ef11727e0c99a5bd42849c5d185 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -24395,7 +24395,8 @@ aarch64_vectorize_can_special_div_by_constant (enum tree_code code,
>|| !TYPE_UNSIGNED (vectype))
>  return false;
>  
> -  unsigned int flags = aarch64_classify_vector_mode (TYPE_MODE (vectype));
> +  machine_mode mode = TYPE_MODE (vectype);
> +  unsigned int flags = aarch64_classify_vector_mode (mode);
>if ((flags & VEC_ANY_SVE) && !TARGET_SVE2)
>  return false;
>  
> @@ -24411,15 +24412,14 @@ aarch64_vectorize_can_special_div_by_constant (enum tree_code code,
>if (in0 == NULL_RTX && in1 == NULL_RTX)
>  return true;
>  
> -  if (!VECTOR_TYPE_P (vectype))
> -   return false;
> -
>gcc_assert (output);
>  
> -  if (!*output)
> -*output = gen_reg_rtx (T

Re: PING^1 [PATCH v2] predict: Adjust optimize_function_for_size_p [PR105818]

2022-12-14 Thread Martin Liška
On 12/14/22 14:22, Jan Hubicka via Gcc-patches wrote:
>>> PR middle-end/105818
>>>
>>> gcc/ChangeLog:
>>>
>>> * predict.cc (optimize_function_for_size_p): Further check
>>> optimize_size of fun->decl when it is valid but no cgraph node.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.target/powerpc/pr105818.c: New test.
>>> * gcc.dg/guality/pr54693-2.c: Adjust for aarch64.
>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr105818.c 
>>> b/gcc/testsuite/gcc.target/powerpc/pr105818.c
>>> new file mode 100644
>>> index 000..679647e189d
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr105818.c
>>> @@ -0,0 +1,11 @@
>>> +/* { dg-options "-Os -fno-tree-vectorize" } */
>>> +
>>> +/* Verify there is no ICE.  */
>>> +
>>> +#pragma GCC optimize "-fno-tree-vectorize"
>>> +
>>> +void
>>> +foo (void)
>>> +{
>>> +  void bar (void);
>>> +}

Hi.

Next time, please CC me if you cite me.

> So the testcase starts with optimize_size set but then it switches to
> optimize_size==0 due to the GCC optimize pragma.  I think this is
> behaviour Martin wants to change, so perhaps the testcase should be
> written with explicit -O2.

No, the pragma does not modify optimize_size, because the "optimize" attribute
behaves as documented:

```
...
The optimize attribute arguments of a function behave as if appended to
the command-line.
```

Martin

> 
> I also wonder what happens when you add the attribute later?
> /* { dg-options "-Os -fno-tree-vectorize" } */
> 
> /* Verify there is no ICE.  */
> 
> #pragma GCC optimize "-fno-tree-vectorize"
> 
> void
> foo (void)
> {
>   void bar (void);
> }
> 
> __attribute__ ((optimize("-fno-tree-vectorize"))) void foo (void);
> 
> I think we should generally avoid making decisions about size/speed
> optimizations so early since the setting may change due to attributes or
> profile feedback...
> 
> Honza



Ping^2: [PATCH] d: Update __FreeBSD_version values [PR107469]

2022-12-14 Thread Lorenzo Salvadore via Gcc-patches
Hello,

Ping https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605685.html

I would like to remind you that Gerald Pfeifer has already volunteered to
commit this patch once it is approved.  However, the patch has not been
approved yet.

Thanks,

Lorenzo Salvadore

> --- Original Message ---
> On Friday, November 11th, 2022 at 12:07 AM, Lorenzo Salvadore 
> develo...@lorenzosalvadore.it wrote:
> 
> > Update __FreeBSD_version values for the latest FreeBSD supported
> > versions. In particular, add __FreeBSD_version for FreeBSD 14, which is
> > necessary to compile libphobos successfully on FreeBSD 14.
> > 
> > The patch has already been applied successfully in the official FreeBSD
> > ports tree for the ports lang/gcc11 and lang/gcc11-devel. Please see the
> > following commits:
> > 
> > https://cgit.freebsd.org/ports/commit/?id=f61fb49b2e76fd4f7a5b7a11510b5109206c19f2
> > https://cgit.freebsd.org/ports/commit/?id=57936dba89ea208e5dbc1bd2d7fda3d29a1838b3
> > 
> > libphobos/ChangeLog:
> > 
> > 2022-11-10 Lorenzo Salvadore develo...@lorenzosalvadore.it
> > 
> > PR d/107469.
> > * libdruntime/core/sys/freebsd/config.d: Update __FreeBSD_version.
> > 
> > ---
> > libphobos/libdruntime/core/sys/freebsd/config.d | 5 +++--
> > 1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/libphobos/libdruntime/core/sys/freebsd/config.d b/libphobos/libdruntime/core/sys/freebsd/config.d
> > index 5e3129e2422..9d502e52e32 100644
> > --- a/libphobos/libdruntime/core/sys/freebsd/config.d
> > +++ b/libphobos/libdruntime/core/sys/freebsd/config.d
> > @@ -14,8 +14,9 @@ public import core.sys.posix.config;
> > // NOTE: When adding newer versions of FreeBSD, verify all current versioned
> > // bindings are still compatible with the release.
> > 
> > - version (FreeBSD_13) enum __FreeBSD_version = 1300000;
> > -else version (FreeBSD_12) enum __FreeBSD_version = 1202000;
> > + version (FreeBSD_14) enum __FreeBSD_version = 1400000;
> > +else version (FreeBSD_13) enum __FreeBSD_version = 1301000;
> > +else version (FreeBSD_12) enum __FreeBSD_version = 1203000;
> > else version (FreeBSD_11) enum __FreeBSD_version = 1104000;
> > else version (FreeBSD_10) enum __FreeBSD_version = 1004000;
> > else version (FreeBSD_9) enum __FreeBSD_version = 903000;
> > --
> > 2.38.0


Re: [PATCH (pushed)] docs: document --param=ipa-sra-ptrwrap-growth-factor

2022-12-14 Thread Martin Jambor
Hi,

On Wed, Dec 14 2022, Martin Liška wrote:
> gcc/ChangeLog:
>
>   * doc/invoke.texi: Document ipa-sra-ptrwrap-growth-factor.

Thanks for spotting this.  Seeing the email also alerted me to the fact
that what I wrote in the parameter description is almost the opposite of
what it should say.  I'll fix that momentarily with the following patch.

Thanks again,

Martin



Somehow I made the description of the parameter almost the opposite of
what I wanted to say.  Fixed by this patch.

Tested by building gcc on x86_64-linux and make info and make pdf.

gcc/ChangeLog:

2022-12-14  Martin Jambor  

* doc/invoke.texi (ipa-sra-ptrwrap-growth-factor): Fix the
description.
* params.opt (ipa-sra-ptrwrap-growth-factor): Likewise.
---
 gcc/doc/invoke.texi | 3 ++-
 gcc/params.opt  | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 7dc1d45e275..f48df64cc2a 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15523,7 +15523,8 @@ pointer parameter.
 @item ipa-sra-ptrwrap-growth-factor
 Additional maximum allowed growth of total size of new parameters
 that ipa-sra replaces a pointer to an aggregate with,
-if it points to a local variable that the caller never writes to.
+if it points to a local variable that the caller only writes to and
+passes it as an argument to other functions.
 
 @item ipa-sra-max-replacements
 Maximum pieces of an aggregate that IPA-SRA tracks.  As a
diff --git a/gcc/params.opt b/gcc/params.opt
index e0fd05fb44a..e0380bf10f9 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -296,7 +296,7 @@ Maximum allowed growth of total size of new parameters that ipa-sra replaces a p
 
 -param=ipa-sra-ptrwrap-growth-factor=
 Common Joined UInteger Var(param_ipa_sra_ptrwrap_growth_factor) Init(4) IntegerRange(1, 8) Param Optimization
-Additional maximum allowed growth of total size of new parameters that ipa-sra replaces a pointer to an aggregate with, if it points to a local variable that the caller never writes to.
+Additional maximum allowed growth of total size of new parameters that ipa-sra replaces a pointer to an aggregate with, if it points to a local variable that the caller only writes to and passes it as an argument to functions.
 
 -param=ira-loop-reserved-regs=
 Common Joined UInteger Var(param_ira_loop_reserved_regs) Init(2) Param Optimization
-- 
2.38.1



Re: Ping---[V3][PATCH 2/2] Add a new warning option -Wstrict-flex-arrays.

2022-12-14 Thread Qing Zhao via Gcc-patches


> On Dec 14, 2022, at 4:03 AM, Richard Biener  wrote:
> 
> On Tue, 13 Dec 2022, Qing Zhao wrote:
> 
>> Richard, 
>> 
>> Do you have any decision on this one? 
>> Do we need this warning option for GCC?
> 
> Looking at the testcases it seems that the diagnostic amends
> -Warray-bounds diagnostics for trailing but not flexible arrays?

Yes.

> Wouldn't it be better to generally diagnose this, so have
> -Warray-bounds, with -fstrict-flex-arrays, for
> 
> struct X { int a[1]; };
> int foo (struct X *p)
> {
>  return p->a[1];
> }
> 
> emit
> 
> warning: array subscript 1 is above array bounds ...
> note: the trailing array is only a flexible array member with 
> -fno-strict-flex-arrays

This is good too.
My only concern is that the default -Warray-bounds warning messages would
then differ from the current ones.  Will this have any impact on current users?

> 
> ?  Having -Wstrict-flex-arrays=N and N not agree with the
> -fstrict-flex-arrays level sounds hardly useful to me but the
> information that we ran into a trailing array but didn't consider
> it a flex array because of -fstrict-flex-arrays is always a
> useful information?

-Wstrict-flex-arrays does NOT take the argument “N”.  Its level is always
consistent with the level “N” of the corresponding -fstrict-flex-arrays=N.
The -Wstrict-flex-arrays option is only valid when -fstrict-flex-arrays is
present; it reports any misuse of a trailing array as a flexible array
at the LEVEL of -fstrict-flex-arrays.

Let me know if it is still not very clear.

thanks.

Qing
> 
> But maybe I misunderstood this new diagnostic?
> 
> Thanks,
> Richard.
> 
> 
>> thanks.
>> 
>> Qing
>> 
>>> On Dec 6, 2022, at 11:18 AM, Qing Zhao  wrote:
>>> 
>>> '-Wstrict-flex-arrays'
>>>Warn about improper usages of flexible array members according to
>>>the LEVEL of the 'strict_flex_array (LEVEL)' attribute attached to
>>>the trailing array field of a structure if it's available,
>>>otherwise according to the LEVEL of the option
>>>'-fstrict-flex-arrays=LEVEL'.
>>> 
>>>This option is effective only when LEVEL is bigger than 0.
>>>Otherwise, it will be ignored with a warning.
>>> 
>>>when LEVEL=1, warnings will be issued for a trailing array
>>>reference of a structure that has 2 or more elements if the
>>>trailing array is referenced as a flexible array member.
>>> 
>>>when LEVEL=2, in addition to LEVEL=1, additional warnings will be
>>>issued for a trailing one-element array reference of a structure if
>>>the array is referenced as a flexible array member.
>>> 
>>>when LEVEL=3, in addition to LEVEL=2, additional warnings will be
>>>issued for a trailing zero-length array reference of a structure if
>>>the array is referenced as a flexible array member.
>>> 
>>> gcc/ChangeLog:
>>> 
>>> * doc/invoke.texi: Document -Wstrict-flex-arrays option.
>>> * gimple-array-bounds.cc (check_out_of_bounds_and_warn): Add two more
>>> arguments.
>>> (array_bounds_checker::check_array_ref): Issue warnings for
>>> -Wstrict-flex-arrays.
>>> * opts.cc (finish_options): Issue warning for unsupported combination
>>> of -Wstrict_flex_arrays and -fstrict-flex-array.
>>> * tree-vrp.cc (execute_ranger_vrp): Enable the pass when
>>> warn_strict_flex_array is true.
>>> 
>>> gcc/c-family/ChangeLog:
>>> 
>>> * c.opt (Wstrict-flex-arrays): New option.
>>> 
>>> gcc/testsuite/ChangeLog:
>>> 
>>> * gcc.dg/Warray-bounds-flex-arrays-1.c: Update testing case with
>>> -Wstrict-flex-arrays.
>>> * gcc.dg/Warray-bounds-flex-arrays-2.c: Likewise.
>>> * gcc.dg/Warray-bounds-flex-arrays-3.c: Likewise.
>>> * gcc.dg/Warray-bounds-flex-arrays-4.c: Likewise.
>>> * gcc.dg/Warray-bounds-flex-arrays-5.c: Likewise.
>>> * gcc.dg/Warray-bounds-flex-arrays-6.c: Likewise.
>>> * c-c++-common/Wstrict-flex-arrays.c: New test.
>>> * gcc.dg/Wstrict-flex-arrays-2.c: New test.
>>> * gcc.dg/Wstrict-flex-arrays-3.c: New test.
>>> * gcc.dg/Wstrict-flex-arrays.c: New test.
>>> ---
>>> gcc/c-family/c.opt|   5 +
>>> gcc/doc/invoke.texi   |  27 -
>>> gcc/gimple-array-bounds.cc| 103 ++
>>> gcc/opts.cc   |   8 ++
>>> .../c-c++-common/Wstrict-flex-arrays.c|   9 ++
>>> .../gcc.dg/Warray-bounds-flex-arrays-1.c  |   5 +-
>>> .../gcc.dg/Warray-bounds-flex-arrays-2.c  |   6 +-
>>> .../gcc.dg/Warray-bounds-flex-arrays-3.c  |   7 +-
>>> .../gcc.dg/Warray-bounds-flex-arrays-4.c  |   5 +-
>>> .../gcc.dg/Warray-bounds-flex-arrays-5.c  |   6 +-
>>> .../gcc.dg/Warray-bounds-flex-arrays-6.c  |   7 +-
>>> gcc/testsuite/gcc.dg/Wstrict-flex-arrays-2.c  |  39 +++
>>> gcc/testsuite/gcc.dg/Wstrict-flex-arrays-3.c  |  39 +++
>>> gcc/testsuite/gcc.dg/Wstrict-flex-arrays.c|  39 +++
>>> gcc/tree-vrp.cc   |   2 +-
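
As a reading aid, the LEVEL rules in the quoted documentation text can be sketched as a small C model.  This is only an illustration of the description in this thread, not GCC's implementation; the names trailing_kind and warns_at_level are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of a trailing array by its declared bound.  */
enum trailing_kind
{
  TRAILING_FLEXIBLE,  /* int a[];   a true flexible array member  */
  TRAILING_ZERO,      /* int a[0];  zero-length array  */
  TRAILING_ONE,       /* int a[1];  one-element array  */
  TRAILING_MANY       /* int a[N];  with N >= 2  */
};

/* Model of the quoted documentation: would a reference that treats a
   trailing array of kind KIND as a flexible array member be diagnosed
   at the given LEVEL?  LEVEL 0 means the option has no effect.  */
bool
warns_at_level (int level, enum trailing_kind kind)
{
  assert (level >= 0 && level <= 3);
  switch (kind)
    {
    case TRAILING_MANY:
      return level >= 1;
    case TRAILING_ONE:
      return level >= 2;
    case TRAILING_ZERO:
      return level >= 3;
    default:
      /* a[] is a flexible array member at every level.  */
      return false;
    }
}
```

Under this model, struct X { int a[1]; } from Richard's example would only be diagnosed from LEVEL=2 upwards.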

[committed] libstdc++: Fix size passed to operator delete [PR108097]

2022-12-14 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

The number of elements gets stored in _M_capacity so use a separate
variable for the number of bytes to allocate.

libstdc++-v3/ChangeLog:

PR libstdc++/108097
* include/std/stacktrace (basic_stacktrace::_Impl): Do not
multiply N by sizeof(value_type) when allocating.
---
 libstdc++-v3/include/std/stacktrace | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/std/stacktrace b/libstdc++-v3/include/std/stacktrace
index 83c6463b0d8..402be3e828e 100644
--- a/libstdc++-v3/include/std/stacktrace
+++ b/libstdc++-v3/include/std/stacktrace
@@ -608,8 +608,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
{
  if constexpr (is_same_v>)
{
- __n *= sizeof(value_type);
- void* const __p = _GLIBCXX_OPERATOR_NEW (__n, nothrow_t{});
+ // For std::allocator we use nothrow-new directly so we
+ // don't need to handle bad_alloc exceptions.
+ size_t __nb = __n * sizeof(value_type);
+ void* const __p = _GLIBCXX_OPERATOR_NEW (__nb, nothrow_t{});
  if (__p == nullptr) [[unlikely]]
return nullptr;
  _M_frames = static_cast(__p);
-- 
2.38.1
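
The point of the commit message, keeping the element count and the byte count in separate variables, can be sketched in plain C.  This is an analogy under assumptions, not the libstdc++ code: frame, frame_buf and the frame_buf_* functions are hypothetical names, and malloc/free stand in for the operator new/delete calls in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for the element type stored in the buffer.  */
typedef struct { void *pc; } frame;

typedef struct
{
  frame *frames;
  size_t capacity;  /* Number of elements, NOT bytes.  */
} frame_buf;

/* Allocate room for N elements.  Returns 0 on success, -1 on failure.  */
int
frame_buf_allocate (frame_buf *b, size_t n)
{
  if (n == 0 || n > SIZE_MAX / sizeof (frame))
    return -1;
  /* Separate byte count: N itself must stay an element count because
     it is stored as the capacity below.  */
  size_t nbytes = n * sizeof (frame);
  void *p = malloc (nbytes);
  if (p == NULL)
    return -1;
  b->frames = p;
  b->capacity = n;
  return 0;
}

/* The size a sized deallocation would be given.  It only matches the
   allocated size if capacity was stored in elements.  */
size_t
frame_buf_dealloc_size (const frame_buf *b)
{
  assert (b != NULL);
  return b->capacity * sizeof (frame);
}

void
frame_buf_free (frame_buf *b)
{
  free (b->frames);
  b->frames = NULL;
  b->capacity = 0;
}
```

Had n been multiplied in place before being stored, the size computed at deallocation would have been scaled by sizeof (frame) twice.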



Re: Ping---[V3][PATCH 2/2] Add a new warning option -Wstrict-flex-arrays.

2022-12-14 Thread Qing Zhao via Gcc-patches


> On Dec 14, 2022, at 9:08 AM, Qing Zhao via Gcc-patches 
>  wrote:
> 
> 
> 
>> On Dec 14, 2022, at 4:03 AM, Richard Biener  wrote:
>> 
>> On Tue, 13 Dec 2022, Qing Zhao wrote:
>> 
>>> Richard, 
>>> 
>>> Do you have any decision on this one? 
>>> Do we need this warning option for GCC? 
>> 
>> Looking at the testcases it seems that the diagnostic amends
>> -Warray-bounds diagnostics for trailing but not flexible arrays?
> 
> Yes.
> 
>> Wouldn't it be better to generally diagnose this, so have
>> -Warray-bounds, with -fstrict-flex-arrays, for
>> 
>> struct X { int a[1]; };
>> int foo (struct X *p)
>> {
>> return p->a[1];
>> }
>> 
>> emit
>> 
>> warning: array subscript 1 is above array bounds ...
>> note: the trailing array is only a flexible array member with 
>> -fno-strict-flex-arrays
> 
> This is good too.
> My only concern with doing this is, the default warning messages of 
> -Warray-bounds would be different than
> the current ones, will this have any impact on the current users?

My bad: the default warning message of -Warray-bounds without
-fstrict-flex-arrays should not change.
Only with -fstrict-flex-arrays=N (N>0) will the warning messages of
-Warray-bounds differ from the current ones.
This should be fine.
> 
>> 
>> ?  Having -Wstrict-flex-arrays=N and N not agree with the
>> -fstrict-flex-arrays level sounds hardly useful to me but the
>> information that we ran into a trailing array but didn't consider
>> it a flex array because of -fstrict-flex-arrays is always a
>> useful information?
> 
> -Wstrict-flex-arrays does NOT have the argument “N”.  Its level will be 
> consistent with the level “N” of the corresponding
> -fstrict-flex-arrays=N.
> -Wstrict-flex-arrays option is only valid when -fstrict-flex-arrays is 
> present, it will report any misuse of treating trailing array 
> as flexible array at the LEVEL of -fstrict-flex-arrays. 
> 
> Let me know if it is still not very clear.
> 
> thanks.
> 
> Qing
>> 
>> But maybe I misunderstood this new diagnostic?
>> 
>> Thanks,
>> Richard.
>> 
>> 
>>> thanks.
>>> 
>>> Qing
>>> 
>>>> On Dec 6, 2022, at 11:18 AM, Qing Zhao  wrote:
>>>> 
>>>> '-Wstrict-flex-arrays'
>>>>    Warn about improper usages of flexible array members according to
>>>>    the LEVEL of the 'strict_flex_array (LEVEL)' attribute attached to
>>>>    the trailing array field of a structure if it's available,
>>>>    otherwise according to the LEVEL of the option
>>>>    '-fstrict-flex-arrays=LEVEL'.
>>>> 
>>>>    This option is effective only when LEVEL is bigger than 0.
>>>>    Otherwise, it will be ignored with a warning.
>>>> 
>>>>    when LEVEL=1, warnings will be issued for a trailing array
>>>>    reference of a structure that has 2 or more elements if the
>>>>    trailing array is referenced as a flexible array member.
>>>> 
>>>>    when LEVEL=2, in addition to LEVEL=1, additional warnings will be
>>>>    issued for a trailing one-element array reference of a structure if
>>>>    the array is referenced as a flexible array member.
>>>> 
>>>>    when LEVEL=3, in addition to LEVEL=2, additional warnings will be
>>>>    issued for a trailing zero-length array reference of a structure if
>>>>    the array is referenced as a flexible array member.
>>>> 
>>>> gcc/ChangeLog:
>>>> 
>>>> * doc/invoke.texi: Document -Wstrict-flex-arrays option.
>>>> * gimple-array-bounds.cc (check_out_of_bounds_and_warn): Add two more
>>>> arguments.
>>>> (array_bounds_checker::check_array_ref): Issue warnings for
>>>> -Wstrict-flex-arrays.
>>>> * opts.cc (finish_options): Issue warning for unsupported combination
>>>> of -Wstrict_flex_arrays and -fstrict-flex-array.
>>>> * tree-vrp.cc (execute_ranger_vrp): Enable the pass when
>>>> warn_strict_flex_array is true.
>>>> 
>>>> gcc/c-family/ChangeLog:
>>>> 
>>>> * c.opt (Wstrict-flex-arrays): New option.
>>>> 
>>>> gcc/testsuite/ChangeLog:
>>>> 
>>>> * gcc.dg/Warray-bounds-flex-arrays-1.c: Update testing case with
>>>> -Wstrict-flex-arrays.
>>>> * gcc.dg/Warray-bounds-flex-arrays-2.c: Likewise.
>>>> * gcc.dg/Warray-bounds-flex-arrays-3.c: Likewise.
>>>> * gcc.dg/Warray-bounds-flex-arrays-4.c: Likewise.
>>>> * gcc.dg/Warray-bounds-flex-arrays-5.c: Likewise.
>>>> * gcc.dg/Warray-bounds-flex-arrays-6.c: Likewise.
>>>> * c-c++-common/Wstrict-flex-arrays.c: New test.
>>>> * gcc.dg/Wstrict-flex-arrays-2.c: New test.
>>>> * gcc.dg/Wstrict-flex-arrays-3.c: New test.
>>>> * gcc.dg/Wstrict-flex-arrays.c: New test.
>>>> ---
>>>> gcc/c-family/c.opt|   5 +
>>>> gcc/doc/invoke.texi   |  27 -
>>>> gcc/gimple-array-bounds.cc| 103 ++
>>>> gcc/opts.cc   |   8 ++
>>>> .../c-c++-common/Wstrict-flex-arrays.c|   9 ++
>>>> .../gcc.dg/Warray-bounds-flex-arrays-1.c  |   5 +-
>>>> .../gcc.dg/Warray-bounds-flex-arrays-2.c  |   6 +-
>>>> .../gcc.dg/Warray-bounds-flex-arrays-3.c  

Re: [pushed] c++: avoid initializer_list [PR105838]

2022-12-14 Thread Stephan Bergmann via Gcc-patches

On 12/8/22 19:41, Jason Merrill via Gcc-patches wrote:

Tested x86_64-pc-linux-gnu, applying to trunk.


Bisecting shows this started to break


$ cat test.cc
#include <vector>
template struct ConstCharArrayDetector;
template struct ConstCharArrayDetector { using Type = int; };
struct OUString {
template OUString(T &, typename ConstCharArrayDetector::Type = 0);
};
std::vector<OUString> f() { return {""}; }



$ g++ -fsyntax-only test.cc
In file included from .../include/c++/13.0.0/vector:65,
 from test.cc:1:
.../include/c++/13.0.0/bits/stl_uninitialized.h: In instantiation of ‘constexpr 
bool std::__check_constructible() [with _ValueType = OUString; _Tp = const char* 
const&]’:
.../include/c++/13.0.0/bits/stl_uninitialized.h:182:4:   required from 
‘_ForwardIterator std::uninitialized_copy(_InputIterator, _InputIterator, 
_ForwardIterator) [with _InputIterator = const char* const*; _ForwardIterator = 
OUString*]’
.../include/c++/13.0.0/bits/stl_uninitialized.h:373:37:   required from ‘_ForwardIterator 
std::__uninitialized_copy_a(_InputIterator, _InputIterator, _ForwardIterator, 
allocator<_Tp>&) [with _InputIterator = const char* const*; _ForwardIterator = 
OUString*; _Tp = OUString]’
.../include/c++/13.0.0/bits/stl_vector.h:1690:33:   required from ‘void std::vector<_Tp, 
_Alloc>::_M_range_initialize(_ForwardIterator, _ForwardIterator, 
std::forward_iterator_tag) [with _ForwardIterator = const char* const*; _Tp = OUString; 
_Alloc = std::allocator]’
.../include/c++/13.0.0/bits/stl_vector.h:706:23:   required from ‘std::vector<_Tp, 
_Alloc>::vector(_InputIterator, _InputIterator, const allocator_type&) [with _InputIterator = const 
char* const*;  = void; _Tp = OUString; _Alloc = 
std::allocator; allocator_type = std::allocator]’
test.cc:7:39:   required from here
.../include/c++/13.0.0/bits/stl_uninitialized.h:90:56: error: static assertion 
failed: result type must be constructible from input type
   90 |   static_assert(is_constructible<_ValueType, _Tp>::value,
  |^
.../include/c++/13.0.0/bits/stl_uninitialized.h:90:56: note: 
‘std::integral_constant::value’ evaluates to false




Re: [PATCH] libgccjit: Fix a failing test

2022-12-14 Thread Antoni Boucher via Gcc-patches
Thanks!

In your patch, you're missing this line at the end of the commit
message:

   Signed-off-by: Guillaume Gomez 

On Wed, 2022-12-14 at 14:39 +0100, Guillaume Gomez via Jit wrote:
> Hi,
> 
> This fixes bug 107999.
> 
> Thanks in advance for the review.



[PATCH] ipa-sra: Fix address escape case when detecting Fortran descriptors

2022-12-14 Thread Martin Jambor
Hi,

The discussion about scan_expr_access in ipa-sra.cc brought my
attention to a missing case of handling an ADDR_EXPR.  As the added
testcase shows, the heuristic that looks for parameters which are
local variables that are only written to and passed by reference in
calls can miss a case where the address of the variable in question is
stored elsewhere in an assignment.

This patch adds that case to the function and also adds the
optimization that Richi suggested, i.e. bailing out early on simple
SSA_NAMEs and constant trees.

The patch is undergoing bootstrap and testing on x86_64-linux right
now.  OK if it passes?

Thanks,

Martin


gcc/ChangeLog:

2022-12-14  Martin Jambor  

* ipa-sra.cc (loaded_decls): Adjust comment.
(scan_expr_access): Also detect assignments of address of local
variables to a variable.  Bail out early on SSA_NAMEs and
constants as an optimization.

gcc/testsuite/ChangeLog:

2022-12-14  Martin Jambor  

* gcc.dg/ipa/ipa-sra-29.c: New test.
---
 gcc/ipa-sra.cc| 16 ++-
 gcc/testsuite/gcc.dg/ipa/ipa-sra-29.c | 38 +++
 2 files changed, 53 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/ipa-sra-29.c

diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc
index 93f5e34b15c..bcabdedfc6c 100644
--- a/gcc/ipa-sra.cc
+++ b/gcc/ipa-sra.cc
@@ -592,7 +592,8 @@ namespace {
 
 hash_map *decl2desc;
 
-/* All local DECLs ever loaded from.  */
+/* All local DECLs ever loaded from and those that have their address
+   assigned to a variable.  */
 
 hash_set  *loaded_decls;
 
@@ -1743,6 +1744,19 @@ scan_expr_access (tree expr, gimple *stmt, isra_scan_context ctx,
   bool deref = false;
   bool reverse;
 
+  if (TREE_CODE (expr) == ADDR_EXPR)
+{
+  if (ctx == ISRA_CTX_ARG)
+   return;
+  tree t = get_base_address (TREE_OPERAND (expr, 0));
+  if (TREE_CODE (t) == VAR_DECL && !TREE_STATIC (t))
+   loaded_decls->add (t);
+  return;
+}
+  if (TREE_CODE (expr) == SSA_NAME
+  || CONSTANT_CLASS_P (expr))
+return;
+
   if (TREE_CODE (expr) == BIT_FIELD_REF
   || TREE_CODE (expr) == IMAGPART_EXPR
   || TREE_CODE (expr) == REALPART_EXPR)
diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-29.c b/gcc/testsuite/gcc.dg/ipa/ipa-sra-29.c
new file mode 100644
index 000..aee45ea0e8f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-29.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-ipa-sra-details"  } */
+
+struct S
+{
+  float f;
+  int i;
+  void *p;
+};
+
+extern struct S *gp;
+int baz (float);
+
+static int
+__attribute__((noinline))
+bar (struct S *p)
+{
+  if (p->i != 6)
+__builtin_abort ();
+
+  return baz(p->f);
+}
+
+int
+foo (void)
+{
+  struct S s;
+
+  gp = &s;
+  s.f = 7.4;
+  s.i = 6;
+  s.p = &s;
+
+  bar (&s);
+  return 0;
+}
+
+/* { dg-final { scan-ipa-dump-not "Variable constructed just to be passed to calls" "sra" } } */
-- 
2.38.1
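
Why the stored address matters can be shown with a runnable variant of the situation in the testcase.  This is a sketch for illustration only: read_param, peek_through_global and escape_demo are hypothetical names, and the code demonstrates the source-level behaviour that the heuristic must respect, not what IPA-SRA itself does.

```c
#include <assert.h>
#include <stddef.h>

struct S
{
  float f;
  int i;
};

/* Global through which the address of a local escapes.  */
struct S *gp;

/* Reads the caller's local through the escaped pointer rather than
   through a parameter.  */
int
peek_through_global (void)
{
  assert (gp != NULL);
  return gp->i;
}

/* Only ever sees the object through its parameter.  */
int
read_param (struct S *p)
{
  return p->i;
}

int
escape_demo (void)
{
  struct S s;

  gp = &s;      /* Address stored by an assignment: the escape.  */
  s.f = 7.4f;
  s.i = 6;

  int via_param = read_param (&s);
  int via_global = peek_through_global ();

  gp = NULL;    /* Do not leave a dangling pointer behind.  */
  return via_param + via_global;
}
```

Because the stores to s are observable through gp, s cannot be treated as a variable that is only written to and passed to calls.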



[PATCH] ipa-sra: Consider the first parameter of methods safe to dereference

2022-12-14 Thread Martin Jambor
Hi,

After reviewing the patch that taught IPA-SRA that REFERENCE_TYPEs are
always non-NULL, Honza requested that the pass also handle the first
parameters of methods, i.e. this pointers, in the same way.  So this
patch does that.

The patch is undergoing bootstrap and testing on x86_64-linux right
now.  OK if it passes?

Thanks,

Martin


gcc/ChangeLog:

2022-12-14  Martin Jambor  

* ipa-sra.cc (create_parameter_descriptors): Consider the first
parameter of a method safe to dereference.

gcc/testsuite/ChangeLog:

2022-12-14  Martin Jambor  

* g++.dg/ipa/ipa-sra-6.C: New test.
---
 gcc/ipa-sra.cc   |  7 +++-
 gcc/testsuite/g++.dg/ipa/ipa-sra-6.C | 62 
 2 files changed, 68 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/ipa/ipa-sra-6.C

diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc
index bcabdedfc6c..6fe336eeb19 100644
--- a/gcc/ipa-sra.cc
+++ b/gcc/ipa-sra.cc
@@ -1206,7 +1206,12 @@ create_parameter_descriptors (cgraph_node *node,
   if (POINTER_TYPE_P (type))
{
  desc->by_ref = true;
- desc->safe_ref = (TREE_CODE (type) == REFERENCE_TYPE);
+ if (TREE_CODE (type) == REFERENCE_TYPE
+ || (num == 0
+ && TREE_CODE (TREE_TYPE (node->decl)) == METHOD_TYPE))
+   desc->safe_ref = true;
+ else
+   desc->safe_ref = false;
  type = TREE_TYPE (type);
 
  if (TREE_CODE (type) == FUNCTION_TYPE
diff --git a/gcc/testsuite/g++.dg/ipa/ipa-sra-6.C b/gcc/testsuite/g++.dg/ipa/ipa-sra-6.C
new file mode 100644
index 000..d6b7822533f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/ipa-sra-6.C
@@ -0,0 +1,62 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-ipa-sra"  } */
+
+namespace {
+
+class C
+{
+
+  int mi;
+
+public:
+  C (int i)
+: mi(i)
+  {}
+
+  void foo (int c);
+};
+
+volatile int vi;
+
+
+void __attribute__((noinline))
+C::foo (int cond)
+{
+  int i;
+  if (cond)
+i = mi;
+  else
+i = 0;
+  vi = i;
+}
+
+static C c_instance(1);
+}
+
+void __attribute__((noinline))
+bar (C *p, int cond)
+{
+  p->foo (cond);
+}
+
+
+class C *gp;
+
+void something(void);
+
+void
+baz (int cond)
+{
+  C c(vi);
+  gp = &c;
+  something ();
+  bar (gp, cond);
+}
+
+void
+hoo(void)
+{
+  gp = &c_instance;
+}
+
+/* { dg-final { scan-ipa-dump "Will split parameter" "sra" } } */
-- 
2.38.1



Re: [PATCH] ipa-sra: Consider the first parameter of methods safe to dereference

2022-12-14 Thread Jan Hubicka via Gcc-patches
> Hi,
> 
> After reviewing the patch that taught IPA-SRA that REFERENCE_TYPEs are
> always non-NULL, Honza requested that the pass also handle the first
> parameters of methods, i.e. this pointers, in the same way.  So this
> patch does that.
> 
> The patch is undergoing bootstrap and testing on an x86_64-linux right
> now.  OK if it passes?
> 
> Thanks,
> 
> Martin
> 
> 
> gcc/ChangeLog:
> 
> 2022-12-14  Martin Jambor  
> 
>   * ipa-sra.cc (create_parameter_descriptors): Consider the first
>   parameter of a method safe to dereference.
> 
> gcc/testsuite/ChangeLog:
> 
> 2022-12-14  Martin Jambor  
> 
>   * g++.dg/ipa/ipa-sra-6.C: New test.

OK,
thanks a lot!
Honza


Re: [PATCH] ipa-sra: Fix address escape case when detecting Fortran descriptors

2022-12-14 Thread Jan Hubicka via Gcc-patches
> Hi,
> 
> The discussion about scan_expr_access in ipa-sra.cc brought my
> attention to a missing case of handling an ADDR_EXPR.  As the added
> testcase shows, the heuristic that looks for parameters which are
> local variables that are only written to and passed by reference in
> calls can miss a case where the address of the variable in question is
> stored elsewhere in an assignment.
> 
> This patch adds that case to the function and also adds the
> optimization that Richi suggested, i.e. bailing out early on simple
> SSA_NAMEs and constant trees.
> 
> The patch is undergoing bootstrap and testing on an x86_64-linux right
> now.  OK if it passes?
> 
> Thanks,
> 
> Martin
> 
> 
> gcc/ChangeLog:
> 
> 2022-12-14  Martin Jambor  
> 
>   * ipa-sra.cc (loaded_decls): Adjust comment.
>   (scan_expr_access): Also detect assignments of address of local
>   variables to a variable.  Bail out early on SSA_NAMEs and
>   constants as an optimization.
> 
> gcc/testsuite/ChangeLog:
> 
> 2022-12-14  Martin Jambor  
> 
>   * gcc.dg/ipa/ipa-sra-29.c: New test.

OK,
Thanks!
Honza


[PATCH 10/15 V6] arm: Implement cortex-M return signing address codegen

2022-12-14 Thread Andrea Corallo via Gcc-patches
Richard Earnshaw  writes:

[...]

>
> +  if (TARGET_TPCS_FRAME)
> +error ("Return address signing and %<-mtpcs-frame%> are
> incompatible.");
>
> So really this is 'not implemented' rather than not compatible - I
> don't see why we couldn't implement this if we really wanted to.  It's
> not worth implementing it because tpcs-frames are very much legacy
> these days.
>
> So the message should use sorry() and say 'is not supported' rather
> than 'are incompatible'.
>
> +(define_insn "pacbti_nop"
> +  [(set (reg:SI IP_REGNUM)
> + (unspec:SI [(reg:SI SP_REGNUM) (reg:SI LR_REGNUM)]
> +VUNSPEC_PACBTI_NOP))]
>
> No, this needs to be unspec_volatile, not unspec.
>
> +(define_insn "aut_nop"
> +  [(unspec:SI [(reg:SI IP_REGNUM) (reg:SI SP_REGNUM) (reg:SI LR_REGNUM)]
> +   VUNSPEC_AUT_NOP)]
>
> Similarly.
>
> R.


Hi Richard & all,

please find attached the updated patch implementing suggestions.

BR

  Andrea

From adabef75c4af91865b0639243d6d9aa03bf8ad68 Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Thu, 20 Jan 2022 15:36:23 +0100
Subject: [PATCH] [PATCH 10/15] arm: Implement cortex-M return signing address
 codegen

Hi all,

this patch enables return address signing and verification based on
Armv8.1-M Pointer Authentication [1].

To sign the return address, we use the PAC R12, LR, SP instruction
upon function entry.  This signs LR using SP and stores the
result in R12.  R12 is then pushed onto the stack.

During the function epilogue R12 is popped and AUT R12, LR, SP is
used to verify that the content of LR is still valid before returning.

Here is an example of a PAC-instrumented function prologue and epilogue:

void foo (void);

int main()
{
  foo ();
  return 0;
}

Compiled with '-march=armv8.1-m.main -mbranch-protection=pac-ret
-mthumb' translates into:

main:
        pac     ip, lr, sp
        push    {r3, r7, ip, lr}
        add     r7, sp, #0
        bl      foo
        movs    r3, #0
        mov     r0, r3
        pop     {r3, r7, ip, lr}
        aut     ip, lr, sp
        bx      lr

The patch also takes care of generating a PACBTI instruction in place
of the sequence BTI+PAC when Branch Target Identification is also
enabled.

For example, the previous example compiled with '-march=armv8.1-m.main
-mbranch-protection=pac-ret+bti -mthumb' translates into:

main:
        pacbti  ip, lr, sp
        push    {r3, r7, ip, lr}
        add     r7, sp, #0
        bl      foo
        movs    r3, #0
        mov     r0, r3
        pop     {r3, r7, ip, lr}
        aut     ip, lr, sp
        bx      lr

Following previous upstream suggestions, a test for varargs has been
added and '-mtpcs-frame' is treated as incompatible with the return
address signing feature being introduced.

[1] 


gcc/Changelog

2021-11-03  Andrea Corallo  

* config/arm/arm.h (arm_arch8m_main): Declare it.
* config/arm/arm.cc (arm_arch8m_main): Define it.
(arm_option_reconfigure_globals): Set arm_arch8m_main.
(arm_compute_frame_layout, arm_expand_prologue)
(thumb2_expand_return, arm_expand_epilogue)
(arm_conditional_register_usage): Update for pac codegen.
(arm_current_function_pac_enabled_p): New function.
(aarch_bti_enabled): New function.
* config/arm/arm.md (pac_ip_lr_sp, pacbti_ip_lr_sp, aut_ip_lr_sp):
Add new patterns.
* config/arm/unspecs.md (UNSPEC_PAC_NOP)
(VUNSPEC_PACBTI_NOP, VUNSPEC_AUT_NOP): Add unspecs.

gcc/testsuite/Changelog

2021-11-03  Andrea Corallo  

* gcc.target/arm/pac.h : New file.
* gcc.target/arm/pac-1.c : New test case.
* gcc.target/arm/pac-2.c : Likewise.
* gcc.target/arm/pac-3.c : Likewise.
* gcc.target/arm/pac-4.c : Likewise.
* gcc.target/arm/pac-5.c : Likewise.
* gcc.target/arm/pac-6.c : Likewise.
* gcc.target/arm/pac-7.c : Likewise.
* gcc.target/arm/pac-8.c : Likewise.
* gcc.target/arm/pac-9.c : Likewise.
* gcc.target/arm/pac-10.c : Likewise.
* gcc.target/arm/pac-11.c : Likewise.
---
 gcc/config/arm/arm-protos.h   |  1 +
 gcc/config/arm/arm.cc | 74 +++
 gcc/config/arm/arm.h  |  4 ++
 gcc/config/arm/arm.md | 23 +
 gcc/config/arm/unspecs.md |  3 ++
 gcc/testsuite/gcc.target/arm/pac-1.c  | 11 
 gcc/testsuite/gcc.target/arm/pac-10.c | 10 
 gcc/testsuite/gcc.target/arm/pac-11.c | 10 
 gcc/testsuite/gcc.target/arm/pac-2.c  | 11 
 gcc/testsuite/gcc.target/arm/pac-3.c  | 11 
 gcc/testsuite/gcc.target/arm/pac-4.c  | 10 
 gcc/testsuite/gcc.target/arm/pac-5.c  | 28 ++
 gcc/testsuite/gcc.target/arm/pac-6.c  | 18 +++
 gcc/testsuite/gcc.target/arm/pac-7.c  | 32 
 gcc/testsuite/gcc.

Re: Ping---[V3][PATCH 2/2] Add a new warning option -Wstrict-flex-arrays.

2022-12-14 Thread Qing Zhao via Gcc-patches
Hi, Richard,

I guess that we now agree on the following:

“ the information that we ran into a trailing array but didn't consider 
it a flex array because of -fstrict-flex-arrays is always a useful information”

The only thing we haven't decided is whether to:

A. Add this new information to -Warray-bounds when -fstrict-flex-arrays=N 
(N>0) is specified.

OR

B. Issue this new information under a new warning option -Wstrict-flex-arrays 
when -fstrict-flex-arrays=N (N>0) is specified.

My current patch implements B. 

If you think A is better, I will change the patch to implement A. 

Let me know your opinion.

thanks.

Qing


> On Dec 14, 2022, at 4:03 AM, Richard Biener  wrote:
> 
> On Tue, 13 Dec 2022, Qing Zhao wrote:
> 
>> Richard, 
>> 
>> Do you have any decision on this one? 
>> Do we need this warning option for GCC? 
> 
> Looking at the testcases it seems that the diagnostic amends
> -Warray-bounds diagnostics for trailing but not flexible arrays?
> Wouldn't it be better to generally diagnose this, so have
> -Warray-bounds, with -fstrict-flex-arrays, for
> 
> struct X { int a[1]; };
> int foo (struct X *p)
> {
>  return p->a[1];
> }
> 
> emit
> 
> warning: array subscript 1 is above array bounds ...
> note: the trailing array is only a flexible array member with 
> -fno-strict-flex-arrays
> 
> ?  Having -Wstrict-flex-arrays=N and N not agree with the
> -fstrict-flex-arrays level sounds hardly useful to me but the
> information that we ran into a trailing array but didn't consider
> it a flex array because of -fstrict-flex-arrays is always a
> useful information?
> 
> But maybe I misunderstood this new diagnostic?
> 
> Thanks,
> Richard.
> 
> 
>> thanks.
>> 
>> Qing
>> 
>>> On Dec 6, 2022, at 11:18 AM, Qing Zhao  wrote:
>>> 
>>> '-Wstrict-flex-arrays'
>>>Warn about improper usages of flexible array members according to
>>>the LEVEL of the 'strict_flex_array (LEVEL)' attribute attached to
>>>the trailing array field of a structure if it's available,
>>>otherwise according to the LEVEL of the option
>>>'-fstrict-flex-arrays=LEVEL'.
>>> 
>>>This option is effective only when LEVEL is bigger than 0.
>>>Otherwise, it will be ignored with a warning.
>>> 
>>>when LEVEL=1, warnings will be issued for a trailing array
>>>reference of a structure that has 2 or more elements if the
>>>trailing array is referenced as a flexible array member.
>>> 
>>>when LEVEL=2, in addition to LEVEL=1, additional warnings will be
>>>issued for a trailing one-element array reference of a structure if
>>>the array is referenced as a flexible array member.
>>> 
>>>when LEVEL=3, in addition to LEVEL=2, additional warnings will be
>>>issued for a trailing zero-length array reference of a structure if
>>>the array is referenced as a flexible array member.
>>> 
>>> gcc/ChangeLog:
>>> 
>>> * doc/invoke.texi: Document -Wstrict-flex-arrays option.
>>> * gimple-array-bounds.cc (check_out_of_bounds_and_warn): Add two more
>>> arguments.
>>> (array_bounds_checker::check_array_ref): Issue warnings for
>>> -Wstrict-flex-arrays.
>>> * opts.cc (finish_options): Issue warning for unsupported combination
>>> of -Wstrict_flex_arrays and -fstrict-flex-array.
>>> * tree-vrp.cc (execute_ranger_vrp): Enable the pass when
>>> warn_strict_flex_array is true.
>>> 
>>> gcc/c-family/ChangeLog:
>>> 
>>> * c.opt (Wstrict-flex-arrays): New option.
>>> 
>>> gcc/testsuite/ChangeLog:
>>> 
>>> * gcc.dg/Warray-bounds-flex-arrays-1.c: Update testing case with
>>> -Wstrict-flex-arrays.
>>> * gcc.dg/Warray-bounds-flex-arrays-2.c: Likewise.
>>> * gcc.dg/Warray-bounds-flex-arrays-3.c: Likewise.
>>> * gcc.dg/Warray-bounds-flex-arrays-4.c: Likewise.
>>> * gcc.dg/Warray-bounds-flex-arrays-5.c: Likewise.
>>> * gcc.dg/Warray-bounds-flex-arrays-6.c: Likewise.
>>> * c-c++-common/Wstrict-flex-arrays.c: New test.
>>> * gcc.dg/Wstrict-flex-arrays-2.c: New test.
>>> * gcc.dg/Wstrict-flex-arrays-3.c: New test.
>>> * gcc.dg/Wstrict-flex-arrays.c: New test.
>>> ---
>>> gcc/c-family/c.opt|   5 +
>>> gcc/doc/invoke.texi   |  27 -
>>> gcc/gimple-array-bounds.cc| 103 ++
>>> gcc/opts.cc   |   8 ++
>>> .../c-c++-common/Wstrict-flex-arrays.c|   9 ++
>>> .../gcc.dg/Warray-bounds-flex-arrays-1.c  |   5 +-
>>> .../gcc.dg/Warray-bounds-flex-arrays-2.c  |   6 +-
>>> .../gcc.dg/Warray-bounds-flex-arrays-3.c  |   7 +-
>>> .../gcc.dg/Warray-bounds-flex-arrays-4.c  |   5 +-
>>> .../gcc.dg/Warray-bounds-flex-arrays-5.c  |   6 +-
>>> .../gcc.dg/Warray-bounds-flex-arrays-6.c  |   7 +-
>>> gcc/testsuite/gcc.dg/Wstrict-flex-arrays-2.c  |  39 +++
>>> gcc/testsuite/gcc.dg/Wstrict-flex-arrays-3.c  |  39 +++
>>> gcc/testsuite/gcc.dg/Wstrict-flex-arrays.c|  39 +++
>>> gcc/tree-vrp.cc   |  

[PATCH 12/15 V4] arm: implement bti injection

2022-12-14 Thread Andrea Corallo via Gcc-patches
Hi Richard,

thanks for reviewing.

Richard Earnshaw  writes:

> On 28/10/2022 17:40, Andrea Corallo via Gcc-patches wrote:
>> Hi all,
>> please find attached the third iteration of this patch addressing
>> review comments.
>> Thanks
>>Andrea
>> 
>
> @@ -23374,12 +23374,6 @@ output_probe_stack_range (rtx reg1, rtx reg2)
>return "";
>  }
>
> -static bool
> -aarch_bti_enabled ()
> -{
> -  return false;
> -}
> -
>  /* Generate the prologue instructions for entry into an ARM or Thumb-2
> function.  */
>  void
> @@ -32992,6 +32986,61 @@ arm_current_function_pac_enabled_p (void)
>&& !crtl->is_leaf));
>  }
>
> +/* Return TRUE if Branch Target Identification Mechanism is enabled.  */
> +bool
> +aarch_bti_enabled (void)
> +{
> +  return aarch_enable_bti == 1;
> +}
>
> See comment in earlier patch about the location of this function
> moving.   Can aarch_enable_bti take values other than 0 and 1?

Yes default is 2.

[...]

> +  return GET_CODE (pat) == UNSPEC_VOLATILE && XINT (pat, 1) ==
> UNSPEC_BTI_NOP;
>
> I'm not sure where this crept in, but UNSPEC and UNSPEC_VOLATILE have
> separate enums in the backend, so UNSPEC_BTI_NOP should really be
> VUNSPEC_BTI_NOP and defined in the enum "unspecv".

Done

> +aarch_pac_insn_p (rtx x)
> +{
> +  if (!x || !INSN_P (x))
> +return false;
> +
> +  rtx pat = PATTERN (x);
> +
> +  if (GET_CODE (pat) == SET)
> +{
> +  rtx tmp = XEXP (pat, 1);
> +  if (tmp
> +   && GET_CODE (tmp) == UNSPEC
> +   && (XINT (tmp, 1) == UNSPEC_PAC_NOP
> +   || XINT (tmp, 1) == UNSPEC_PACBTI_NOP))
> + return true;
> +}
> +
>
> This will also need updating (see review on earlier patch) because
> PACBTI needs to be unspec_volatile, while PAC doesn't.

Done

> +/* The following two functions are for code compatibility with aarch64
> +   code, this even if in arm we have only one bti instruction.  */
> +
>
> I'd just write
>  /* Target specific mapping for aarch_gen_bti_c and
>  aarch_gen_bti_j. For Arm, both of these map to a simple BTI
> instruction.  */

Done

>
> @@ -162,6 +162,7 @@ (define_c_enum "unspec" [
>UNSPEC_PAC_NOP ; Represents PAC signing LR
>UNSPEC_PACBTI_NOP  ; Represents PAC signing LR + valid landing pad
>UNSPEC_AUT_NOP ; Represents PAC verifying LR
> +  UNSPEC_BTI_NOP ; Represent BTI
>  ])
>
> BTI is an unspec volatile, so this should be in the "vunspec" enum and
> renamed accordingly (see above).

Done.

Please find attached the updated version of this patch.

BR

  Andrea

From 582b5e4e4fe089f6865cc3e0360afd1ff168 Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Thu, 7 Apr 2022 11:51:56 +0200
Subject: [PATCH] [PATCH 12/15] arm: implement bti injection

Hi all,

this patch enables the Armv8.1-M Branch Target Identification mechanism
[1].

This is achieved by using the bti pass made common with Aarch64.

The pass iterates through the instructions and adds the necessary BTI
instructions at the beginning of every function and at every landing
pad targeted by indirect jumps.

Best Regards

  Andrea

[1]


gcc/ChangeLog

2022-04-07  Andrea Corallo  

* config.gcc (arm*-*-*): Add 'aarch-bti-insert.o' object.
* config/arm/arm-protos.h: Update.
* config/arm/arm.cc (aarch_bti_enabled): Update.
(aarch_bti_j_insn_p, aarch_pac_insn_p, aarch_gen_bti_c)
(aarch_gen_bti_j): New functions.
* config/arm/arm.md (bti_nop): New insn.
* config/arm/t-arm (PASSES_EXTRA): Add 'arm-passes.def'.
(aarch-bti-insert.o): New target.
* config/arm/unspecs.md (VUNSPEC_BTI_NOP): New unspec.
* config/arm/aarch-bti-insert.cc (rest_of_insert_bti): Verify arch
compatibility.
* config/arm/arm-passes.def: New file.

gcc/testsuite/ChangeLog

2022-04-07  Andrea Corallo  

* gcc.target/arm/bti-1.c: New testcase.
* gcc.target/arm/bti-2.c: Likewise.
---
 gcc/config.gcc   |  2 +-
 gcc/config/arm/arm-passes.def| 21 ++
 gcc/config/arm/arm-protos.h  |  2 +
 gcc/config/arm/arm.cc| 53 -
 gcc/config/arm/arm.md|  7 
 gcc/config/arm/t-arm | 10 +
 gcc/config/arm/unspecs.md|  1 +
 gcc/testsuite/gcc.target/arm/bti-1.c | 12 ++
 gcc/testsuite/gcc.target/arm/bti-2.c | 58 
 9 files changed, 163 insertions(+), 3 deletions(-)
 create mode 100644 gcc/config/arm/arm-passes.def
 create mode 100644 gcc/testsuite/gcc.target/arm/bti-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/bti-2.c

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 86abcd26185..f578b88dd49 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -351,7 +351,7 @@ arc*-*-*)
;;
 arm*-*-*)
cpu_type=arm
-   extra_objs="arm-builtins.o

Re: [PATCH 10/15 V6] arm: Implement cortex-M return signing address codegen

2022-12-14 Thread Richard Earnshaw via Gcc-patches




On 14/12/2022 16:35, Andrea Corallo via Gcc-patches wrote:

Richard Earnshaw  writes:

[...]



+  if (TARGET_TPCS_FRAME)
+error ("Return address signing and %<-mtpcs-frame%> are
incompatible.");

So really this is 'not implemented' rather than not compatible - I
don't see why we couldn't implement this if we really wanted to.  It's
not worth implementing it because tpcs-frames are very much legacy
these days.

So the message should use sorry() and say 'is not supported' rather
than 'are incompatible'.

+(define_insn "pacbti_nop"
+  [(set (reg:SI IP_REGNUM)
+   (unspec:SI [(reg:SI SP_REGNUM) (reg:SI LR_REGNUM)]
+  VUNSPEC_PACBTI_NOP))]

No, this needs to be unspec_volatile, not unspec.

+(define_insn "aut_nop"
+  [(unspec:SI [(reg:SI IP_REGNUM) (reg:SI SP_REGNUM) (reg:SI LR_REGNUM)]
+ VUNSPEC_AUT_NOP)]

Similarly.

R.



Hi Richard & all,

please find attached the updated patch implementing suggestions.

BR

   Andrea


+   (unspec_volatile:SI [(reg:SI SP_REGNUM) (reg:SI LR_REGNUM)]
+  VUNSPEC_PACBTI_NOP))]

Please fix the indentation of the VUNSPEC_...

+  [(unspec_volatile:SI [(reg:SI IP_REGNUM) (reg:SI SP_REGNUM) (reg:SI 
LR_REGNUM)]

+ VUNSPEC_AUT_NOP)]

And here.

Otherwise ok with that change.

R.


Re: [PATCH 12/15 V4] arm: implement bti injection

2022-12-14 Thread Richard Earnshaw via Gcc-patches




On 14/12/2022 16:40, Andrea Corallo via Gcc-patches wrote:

Hi Richard,

thanks for reviewing.

Richard Earnshaw  writes:


On 28/10/2022 17:40, Andrea Corallo via Gcc-patches wrote:

Hi all,
please find attached the third iteration of this patch addressing
review comments.
Thanks
Andrea



@@ -23374,12 +23374,6 @@ output_probe_stack_range (rtx reg1, rtx reg2)
return "";
  }

-static bool
-aarch_bti_enabled ()
-{
-  return false;
-}
-
  /* Generate the prologue instructions for entry into an ARM or Thumb-2
 function.  */
  void
@@ -32992,6 +32986,61 @@ arm_current_function_pac_enabled_p (void)
&& !crtl->is_leaf));
  }

+/* Return TRUE if Branch Target Identification Mechanism is enabled.  */
+bool
+aarch_bti_enabled (void)
+{
+  return aarch_enable_bti == 1;
+}

See comment in earlier patch about the location of this function
moving.   Can aarch_enable_bti take values other than 0 and 1?


Yes default is 2.


It shouldn't be by this point, because, hopefully you've gone through 
the equivalent of this hunk (from aarch64) somewhere in 
arm_override_options:



   if (aarch_enable_bti == 2)
 {
 #ifdef TARGET_ENABLE_BTI
   aarch_enable_bti = 1;
 #else
   aarch_enable_bti = 0;
 #endif
 }

And after this point the '2' should never be seen again.  We use this 
trick to permit the user to force a default that differs from the 
configuration.


However, I don't see a hunk to do this in patch 3, so perhaps that needs 
updating to fix this.





[...]


+  return GET_CODE (pat) == UNSPEC_VOLATILE && XINT (pat, 1) ==
UNSPEC_BTI_NOP;

I'm not sure where this crept in, but UNSPEC and UNSPEC_VOLATILE have
separate enums in the backend, so UNSPEC_BTI_NOP should really be
VUNSPEC_BTI_NOP and defined in the enum "unspecv".


Done


+aarch_pac_insn_p (rtx x)
+{
+  if (!x || !INSN_P (x))
+return false;
+
+  rtx pat = PATTERN (x);
+
+  if (GET_CODE (pat) == SET)
+{
+  rtx tmp = XEXP (pat, 1);
+  if (tmp
+ && GET_CODE (tmp) == UNSPEC
+ && (XINT (tmp, 1) == UNSPEC_PAC_NOP
+ || XINT (tmp, 1) == UNSPEC_PACBTI_NOP))
+   return true;
+}
+

This will also need updating (see review on earlier patch) because
PACBTI needs to be unspec_volatile, while PAC doesn't.
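
To illustrate the point about separate enums, here is a minimal standalone sketch. It is not GCC code: the toy_* names are hypothetical stand-ins for the RTL code (UNSPEC vs. UNSPEC_VOLATILE) and for the two index enums. It only shows why a predicate has to check the code together with the index, since the two enums are numbered independently and their values can coincide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, not GCC's rtx.  */
enum toy_code { TOY_UNSPEC, TOY_UNSPEC_VOLATILE, TOY_OTHER };

/* Two independent index enums, mirroring "unspec" and "unspecv".  */
enum toy_unspec { TOY_UNSPEC_PAC_NOP };
enum toy_unspecv { TOY_VUNSPEC_PACBTI_NOP, TOY_VUNSPEC_BTI_NOP };

/* The two enums are numbered separately, so values may collide.  */
static_assert ((int) TOY_UNSPEC_PAC_NOP == (int) TOY_VUNSPEC_PACBTI_NOP,
               "index values overlap across the two enums");

struct toy_rtx
{
  enum toy_code code;
  int index;
};

/* Return true if SRC models a PAC or PACBTI source operand.  The index
   is only meaningful once the code has selected which enum it uses.  */
bool
toy_pac_src_p (const struct toy_rtx *src)
{
  if (src == NULL)
    return false;
  if (src->code == TOY_UNSPEC)
    return src->index == TOY_UNSPEC_PAC_NOP;
  if (src->code == TOY_UNSPEC_VOLATILE)
    return src->index == TOY_VUNSPEC_PACBTI_NOP;
  return false;
}
```

In this model a volatile BTI operand is rejected and a volatile PACBTI operand is accepted, even though comparing indices alone could not tell an unspec from an unspec_volatile.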


Done


+/* The following two functions are for code compatibility with aarch64
+   code, this even if in arm we have only one bti instruction.  */
+

I'd just write
  /* Target specific mapping for aarch_gen_bti_c and
  aarch_gen_bti_j. For Arm, both of these map to a simple BTI
instruction.  */


Done



@@ -162,6 +162,7 @@ (define_c_enum "unspec" [
UNSPEC_PAC_NOP  ; Represents PAC signing LR
UNSPEC_PACBTI_NOP   ; Represents PAC signing LR + valid landing pad
UNSPEC_AUT_NOP  ; Represents PAC verifying LR
+  UNSPEC_BTI_NOP   ; Represent BTI
  ])

BTI is an unspec volatile, so this should be in the "unspecv" enum and
renamed accordingly (see above).


Done.

Please find attached the updated version of this patch.

BR

   Andrea



Apart from that, this is OK.

R.


Re: [PATCH 12/15 V4] arm: implement bti injection

2022-12-14 Thread Richard Earnshaw via Gcc-patches




On 14/12/2022 17:00, Richard Earnshaw via Gcc-patches wrote:



On 14/12/2022 16:40, Andrea Corallo via Gcc-patches wrote:

Hi Richard,

thanks for reviewing.

Richard Earnshaw  writes:


On 28/10/2022 17:40, Andrea Corallo via Gcc-patches wrote:

Hi all,
please find attached the third iteration of this patch addressing
review comments.
Thanks
    Andrea



@@ -23374,12 +23374,6 @@ output_probe_stack_range (rtx reg1, rtx reg2)
    return "";
  }

-static bool
-aarch_bti_enabled ()
-{
-  return false;
-}
-
  /* Generate the prologue instructions for entry into an ARM or Thumb-2
 function.  */
  void
@@ -32992,6 +32986,61 @@ arm_current_function_pac_enabled_p (void)
    && !crtl->is_leaf));
  }

+/* Return TRUE if Branch Target Identification Mechanism is 
enabled.  */

+bool
+aarch_bti_enabled (void)
+{
+  return aarch_enable_bti == 1;
+}

See comment in earlier patch about the location of this function
moving.   Can aarch_enable_bti take values other than 0 and 1?


Yes default is 2.


It shouldn't be by this point, because, hopefully you've gone through 
the equivalent of this hunk (from aarch64) somewhere in 
arm_override_options:



    if (aarch_enable_bti == 2)
  {
  #ifdef TARGET_ENABLE_BTI
    aarch_enable_bti = 1;
  #else
    aarch_enable_bti = 0;
  #endif
  }

And after this point the '2' should never be seen again.  We use this 
trick to permit the user to force a default that differs from the 
configuration.


However, I don't see a hunk to do this in patch 3, so perhaps that needs 
updating to fix this.


I've just remembered that the above is to support a configure-time 
option of the compiler to enable branch protection.  But perhaps we 
don't want to have that in AArch32, in which case it would be better not 
to have the default be 2 anyway, just default to off (0).
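
For reference, the tri-state handling discussed here could be sketched roughly as follows. This is a standalone illustration, not the arm or aarch64 backend code: the BTI_* constants and both function names are hypothetical, and the configured_default parameter stands in for whether TARGET_ENABLE_BTI was defined at configure time.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; 2 means "not set on the command line".  */
enum { BTI_OFF = 0, BTI_ON = 1, BTI_UNSET = 2 };

/* Resolve the option once, during option override.  An explicit user
   value always wins; only the unset value falls back to the
   configure-time default.  */
int
resolve_bti_option (int user_value, bool configured_default)
{
  assert (user_value >= BTI_OFF && user_value <= BTI_UNSET);
  if (user_value == BTI_UNSET)
    return configured_default ? BTI_ON : BTI_OFF;
  return user_value;
}

/* After resolution only 0 and 1 remain, so a plain test is enough.  */
bool
bti_enabled_p (int resolved_value)
{
  return resolved_value == BTI_ON;
}
```

If the default is simply 0, as suggested for AArch32, the resolution step becomes unnecessary and the predicate can test the option directly.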


R.






[...]


+  return GET_CODE (pat) == UNSPEC_VOLATILE && XINT (pat, 1) ==
UNSPEC_BTI_NOP;

I'm not sure where this crept in, but UNSPEC and UNSPEC_VOLATILE have
separate enums in the backend, so UNSPEC_BTI_NOP should really be
VUNSPEC_BTI_NOP and defined in the enum "unspecv".


Done


+aarch_pac_insn_p (rtx x)
+{
+  if (!x || !INSN_P (x))
+    return false;
+
+  rtx pat = PATTERN (x);
+
+  if (GET_CODE (pat) == SET)
+    {
+  rtx tmp = XEXP (pat, 1);
+  if (tmp
+  && GET_CODE (tmp) == UNSPEC
+  && (XINT (tmp, 1) == UNSPEC_PAC_NOP
+  || XINT (tmp, 1) == UNSPEC_PACBTI_NOP))
+    return true;
+    }
+

This will also need updating (see review on earlier patch) because
PACBTI needs to be unspec_volatile, while PAC doesn't.


Done


+/* The following two functions are for code compatibility with aarch64
+   code, this even if in arm we have only one bti instruction.  */
+

I'd just write
  /* Target specific mapping for aarch_gen_bti_c and
  aarch_gen_bti_j. For Arm, both of these map to a simple BTI
instruction.  */


Done



@@ -162,6 +162,7 @@ (define_c_enum "unspec" [
    UNSPEC_PAC_NOP    ; Represents PAC signing LR
    UNSPEC_PACBTI_NOP    ; Represents PAC signing LR + valid landing pad
    UNSPEC_AUT_NOP    ; Represents PAC verifying LR
+  UNSPEC_BTI_NOP    ; Represent BTI
  ])

BTI is an unspec volatile, so this should be in the "unspecv" enum and
renamed accordingly (see above).


Done.

Please find attached the updated version of this patch.

BR

   Andrea



Apart from that, this is OK.

R.


[PATCH] c++: local alias in typename in lambda [PR105518]

2022-12-14 Thread Patrick Palka via Gcc-patches
We substitute the qualifying scope of a TYPENAME_TYPE directly using
tsubst_aggr_type (so that we can pass entering_scope=true) instead of
going through tsubst, which means we don't properly reuse typedefs
during this substitution.  This ends up causing us to reject the below
testcase because we substitute the TYPENAME_TYPE impl::type as if it
were written without the typedef impl for A, and thus we expect the
non-capturing lambda to capture t.

This patch fixes this by making tsubst_aggr_type delegate typedefs
to tsubst so that they get properly reused, and then adjusting the result
appropriately if entering_scope is true.  In passing, this refactors
tsubst_aggr_type into two functions, one that's intended to be called
directly and a more minimal one that's intended to be called only from
the RECORD/UNION/ENUMERAL_TYPE cases of tsubst (and contains only the
necessary bits for that call site).

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?  Patch generated with -w to suppress noisy whitespace changes.

PR c++/105518

gcc/cp/ChangeLog:

* pt.cc (tsubst_aggr_type): Handle typedefs by delegating to
tsubst and adjusting the result if entering_scope.  Split out
the main part of the function into ...
(tsubst_aggr_type_1) ... here.
(tsubst): Use tsubst_aggr_type_1 instead of tsubst_aggr_type.
Handle TYPE_PTRMEMFUNC_P RECORD_TYPEs here instead of in
tsubst_aggr_type_1.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-alias1.C: New test.
---
 gcc/cp/pt.cc  | 58 ++-
 .../g++.dg/cpp0x/lambda/lambda-alias1.C   | 23 
 2 files changed, 65 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-alias1.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 81b7787fd3d..86862e56410 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -185,6 +185,7 @@ static tree tsubst_template_parms (tree, tree, 
tsubst_flags_t);
 static void tsubst_each_template_parm_constraints (tree, tree, tsubst_flags_t);
 tree most_specialized_partial_spec (tree, tsubst_flags_t);
 static tree tsubst_aggr_type (tree, tree, tsubst_flags_t, tree, int);
+static tree tsubst_aggr_type_1 (tree, tree, tsubst_flags_t, tree, int);
 static tree tsubst_arg_types (tree, tree, tree, tsubst_flags_t, tree);
 static tree tsubst_function_type (tree, tree, tsubst_flags_t, tree);
 static bool check_specialization_scope (void);
@@ -13845,23 +13846,49 @@ tsubst_aggr_type (tree t,
   if (t == NULL_TREE)
 return NULL_TREE;
 
-  /* If T is an alias template specialization, we want to substitute that
- rather than strip it, especially if it's dependent_alias_template_spec_p.
- It should be OK not to handle entering_scope in this case, since
- DECL_CONTEXT will never be an alias template specialization.  We only get
- here with an alias when tsubst calls us for TYPENAME_TYPE.  */
-  if (alias_template_specialization_p (t, nt_transparent))
-return tsubst (t, args, complain, in_decl);
+  /* Handle typedefs via tsubst so that they get reused.  */
+  if (typedef_variant_p (t))
+{
+  t = tsubst (t, args, complain, in_decl);
+  if (t == error_mark_node)
+   return error_mark_node;
+
+  /* The effect of entering_scope is that when substitution yields a
+dependent specialization A, lookup_template_class prefers to
+return A's primary template type instead of the implicit instantiation.
+So when entering_scope, we mirror this behavior by inspecting
+TYPE_CANONICAL appropriately, taking advantage of the fact that
+lookup_template_class links the two types by setting TYPE_CANONICAL of
+the latter to the former.  */
+  if (entering_scope
+ && CLASS_TYPE_P (t)
+ && dependent_type_p (t)
+ && TYPE_CANONICAL (t) == TREE_TYPE (TYPE_TI_TEMPLATE (t)))
+   t = TYPE_CANONICAL (t);
+  return t;
+}
 
   switch (TREE_CODE (t))
 {
   case RECORD_TYPE:
-  if (TYPE_PTRMEMFUNC_P (t))
-   return tsubst (TYPE_PTRMEMFUNC_FN_TYPE (t), args, complain, in_decl);
-
-  /* Fall through.  */
   case ENUMERAL_TYPE:
   case UNION_TYPE:
+   return tsubst_aggr_type_1 (t, args, complain, in_decl, entering_scope);
+
+  default:
+   return tsubst (t, args, complain, in_decl);
+}
+}
+
+/* The part of tsubst_aggr_type that's shared with tsubst.  */
+
+static tree
+tsubst_aggr_type_1 (tree t,
+   tree args,
+   tsubst_flags_t complain,
+   tree in_decl,
+   int entering_scope)
+{
   if (TYPE_TEMPLATE_INFO (t) && uses_template_parms (t))
 {
   tree argvec;
@@ -13892,10 +13919,6 @@ tsubst_aggr_type (tree t,
   else
 /* This is not a template type, so there's nothing to do.  */
 return t;
-
-default:
-  return tsubst (t, args, complain, in_decl);
-}
 }
 
 /* Map from a FUNC

Re: [Patch, Fortran] libgfortran's ISO_Fortran_binding.c: Use GCC11 version for backward-only code [PR108056]

2022-12-14 Thread Harald Anlauf via Gcc-patches

Hi Tobias,

thanks for your explanation!  Now things are clearer.

On 14.12.22 at 08:57, Tobias Burnus wrote:

@@ -63 +66 @@ cfi_desc_to_gfc_desc (gfc_array_void *d,
-  d->dtype.version = s->version;
+  d->dtype.version = 0;


I was wondering what the significance of "version" is.
In ISO_Fortran_binding.h we seem to always have
   #define CFI_VERSION 1
and it did not change with gcc-12.


The version is 1 for CFI but it is 0 for GFC. However, as we do not
check the GFC version anywhere and it is not publicly exposed, it does
not really matter. Still, "d->dtype.version = 0;" matches what the
compiler itself produces – and for consistency, setting it to 0 is
better than setting it to 1 (via CFI's version field).

Actually 'dtype.version' is not really set anywhere; at least
gfc_get_dtype_rank_type(...) does not set it; zero initialization is
most common but it could be also some random value. In libgfortran,
GFC_DTYPE_CLEAR explicitly sets it to 0.


OK, that was not easy to find.


@@ -107 +117 @@ gfc_desc_to_cfi_desc (CFI_cdesc_t **d_pt
-  d->version = s->dtype.version;
+  d->version = CFI_VERSION;


This treatment of "version" was the equivalent to the above that
confused me.  Assuming we were to change CFI_VERSION in gcc-13+,
is this the right choice here regarding backward compatibility?


I don't think we will change CFI version any time soon as we rather
closely follow the Fortran standard and I do not see any changes which
are required there.

NOTE: As s->dtype.version is either 0 or some random value, setting
version in the CFI / ISO C descriptor to 1, be it as literal or as macro
constant, makes it the same as CFI_VERSION.
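
A reduced sketch of the two assignments under discussion, for illustration only. The toy_* structs are hypothetical one-field stand-ins for the real descriptors, which carry many more members; only the CFI_VERSION value of 1 is taken from the thread.

```c
#include <assert.h>
#include <stddef.h>

#define CFI_VERSION 1  /* Value quoted above for ISO_Fortran_binding.h.  */

/* Hypothetical, heavily reduced descriptors.  */
struct toy_gfc_desc { int dtype_version; };
struct toy_cfi_desc { int version; };

/* CFI -> GFC: the source version is deliberately ignored; the GFC side
   always gets 0, matching what the compiler itself produces.  */
void
toy_cfi_to_gfc (struct toy_gfc_desc *d, const struct toy_cfi_desc *s)
{
  assert (d != NULL && s != NULL);
  d->dtype_version = 0;
}

/* GFC -> CFI: the source version may be 0 or an arbitrary value, so the
   CFI side is set from the macro instead of being copied.  */
void
toy_gfc_to_cfi (struct toy_cfi_desc *d, const struct toy_gfc_desc *s)
{
  assert (d != NULL && s != NULL);
  d->version = CFI_VERSION;
}
```

Neither direction copies the version across, so a garbage value in one descriptor cannot leak into the other.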


OK


And: I don't think we will change CFI_VERSION or the structure of the
CFI array descriptor any time soon; there does not seem to be any need
for it, it matches the Fortran standard one well (and no changes seem to
be planned on that side) and, finally, changing an array descriptor is
painful!


Agreed.


However, using '1;  /* CFI_VERSION in GCC 11 and at time of writing. */'
would also work – but I would expect that we will go through all CFI
users if we ever change the descriptor (and bump the version), possibly
adding version-number dependent code.


Agreed.  Searching for a string that can be guessed is
easier than looking for a magic '1'.


So besides the "version" question ok from my side.


I hope I could answer the latter.


Yes, and thanks for the patch!

Harald


Tobias








[PATCH] gcov: annotate uncovered branches [PR107537]

2022-12-14 Thread Michael Förderer via Gcc-patches

Dear all,

this is a patch to print the gcov annotations (fallthrough or throw) for
uncovered branches as well.
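
As a rough standalone illustration of the intended output (not the gcov.cc code itself): toy_arc is a hypothetical two-flag stand-in for arc_info, and snprintf replaces fnotice so the sketch can be compiled on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical reduced stand-in for gcov's arc_info.  */
struct toy_arc
{
  bool fall_through;
  bool is_throw;
};

/* Pick the annotation suffix; fallthrough takes precedence over throw,
   as in the existing covered-branch output.  */
const char *
branch_annotation (const struct toy_arc *arc)
{
  assert (arc != NULL);
  return arc->fall_through ? " (fallthrough)"
         : arc->is_throw ? " (throw)" : "";
}

/* Format the "never executed" line with the same suffix appended.  */
int
format_never_executed (char *buf, size_t size, int ix,
                       const struct toy_arc *arc)
{
  return snprintf (buf, size, "branch %2d never executed%s", ix,
                   branch_annotation (arc));
}
```

With this, an uncovered fallthrough arc would be reported as "never executed (fallthrough)" rather than losing the annotation.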


Best regards,

Michael
From b65cfc8a837cd9d1b6421978865210e59ba62e0e Mon Sep 17 00:00:00 2001
From: Spacetown 
Date: Sun, 4 Dec 2022 21:03:34 +0100
Subject: [PATCH] gcov: annotate uncovered branches [PR107537]
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

gcc/ChangeLog:
PR gcc/107537
* gcov.cc (output_branch_count): Add annotation '(fallthrough)' or 
'(throw)' also to uncovered branches.

Signed-off-by: Michael Förderer 
---
 gcc/gcov.cc | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/gcov.cc b/gcc/gcov.cc
index 9cf1071166f..5314be8a887 100644
--- a/gcc/gcov.cc
+++ b/gcc/gcov.cc
@@ -2893,7 +2893,9 @@ output_branch_count (FILE *gcov_file, int ix, const 
arc_info *arc)
 arc->fall_through ? " (fallthrough)"
 : arc->is_throw ? " (throw)" : "");
   else
-   fnotice (gcov_file, "branch %2d never executed", ix);
+   fnotice (gcov_file, "branch %2d never executed%s", ix,
+ arc->fall_through ? " (fallthrough)"
+: arc->is_throw ? " (throw)" : "");
 
   if (flag_verbose)
fnotice (gcov_file, " (BB %d)", arc->dst->id);
-- 
2.32.1 (Apple Git-133)



[PATCH, V3] PR 107299, GCC does not build on PowerPC when long double is IEEE 128-bit

2022-12-14 Thread Michael Meissner via Gcc-patches
This set of patches was first submitted on November 1st.  Kewen.Lin asked for
some changes to the first set of patches.  I also tried to clean up the comments
in the second patch about types that Segher Boessenkool mentioned.

I had just re-submitted the first patch yesterday, but Segher asked that I
repost all three patches.  Here is the original commentary for all three
patches, tweaked a little bit:

These 3 patches fix the problems with building GCC on PowerPC systems when long
double is configured to use the IEEE 128-bit format.

There are 3 patches in this patch set.  The first two patches are required to
fix the basic problem.  The third patch fixes some issues that were noticed
along the way.

The basic issue is internally within GCC there are several types for 128-bit
floating point.  The types are:

1) The long double type (which uses either TFmode for 128-bit long doubles
or possibly DFmode for 64-bit long doubles).  In the normal case, long
double is 128-bits (TFmode) and depending on the configuration options
and the switches passed by the user at compilation time, long double is
either the 128-bit IBM double-double type or IEEE 128-bit.

2)  The type for __ibm128.  If long double is IBM 128-bit double-double,
internally within the compiler, this type is the same as the long
double type.  If long double is either IEEE 128-bit or is 64-bit, then
this type is a separate type.  If long double is not double-double,
this type will use IFmode during RTL.

3)  The type for _Float128.  This type is always IEEE 128-bit if it exists.
While it is a separate internal type, currently if long double is IEEE
128-bit, this type uses TFmode once it gets to RTL, but within Gimple
it is a separate type.  If long double is not IEEE 128-bit, then this
type uses KFmode.  All of the f128 math functions defined by the
compiler use this type.  In the past, _Float128 was a C extended type
and not available in C++.  Now it is a part of the C/C++ 2x standards.

4)  The type for __float128.  The history is I implemented __float128
first, and several releases later, we added _Float128 as a standard C
type.  Unfortunately, I didn't think things through enough when
_Float128 came out.  Like __ibm128, it uses the long double type if
long double is IEEE 128-bit, and now it uses the _Float128 type if long
double is not IEEE 128-bit.  IMHO, this is the major problem.  The two
IEEE 128-bit types should use the same type internally (or at least one
should be a qualified type of the other).  Before we started adding
more support for _Float128, it mostly works, but now it doesn't with
more optimizations being done.

5)  The error occurs in building _mulkc3 in libgcc, when the TFmode type in
the code is defined to use attribute((mode(TF))), but the functions
that are called all have _Float128 arguments.  These are separate
types, and ultimately one of the consistency checks fails because they
are different types.

There are 3 patches in this set:

1)  The first patch rewrites how the complex 128-bit multiply and divide
functions are done in the compiler.  In the old scheme, essentially
there were only two types ever being used, the long double type, and
the not long double type.  The original code would name these
functions either __multc3/__divtc3 or __mulkc3/__divkc3.  This worked
because there were only two types.
With straightening out the types, so __float128/_Float128 is never the
long double type, there are potentially 3-4 types.  However, the C
front end and the middle end code will not let us create two built-in
functions that have the same name.

Patch #1 rips out this code and rewrites it to be cleaner.

In the original version of the patches, I disabled doing the mapping
when building libgcc because it caused problems when building __mulkc3
and __divkc3.  I have removed this check, since the second patch will
allow these functions to be built without disabling the mapping.

2)  The second patch fixes the problem of __float128 and _Float128 not
being the same if long double is IEEE 128-bit.  After this patch, both
_Float128 and __float128 types will always use the same type.  When we
get to RTL, it will always use KFmode type (and not use TFmode).  The
stdc++ library will not build if we use TFmode for these types due to
the other changes.

There is a minor codegen issue that if you explicitly use long double
and call the F128 FMA (fused multiply-add) round to odd functions that
are defined to use __float128/_Float128 arguments.  While we might be
able to optimize these later,

[PATCH 1/3, V3] PR 107299, Rework 128-bit complex multiply and divide

2022-12-14 Thread Michael Meissner via Gcc-patches
This patch reworks how the complex multiply and divide built-in functions are
done.  Previously we created built-in declarations for doing long double complex
multiply and divide when long double is IEEE 128-bit.  The old code also did not
support __ibm128 complex multiply and divide if long double is IEEE 128-bit.

In terms of history, I wrote the original code just as I was starting to test
GCC on systems where IEEE 128-bit long double was the default.  At the time, we
had not yet started mangling the built-in function names as a way to bridge
going from a system with 128-bit IBM long double to 128-bit IEEE long double.

The original code depends on there only being two 128-bit types involved.  With
the next patch in this series, this assumption will no longer be true.  When
long double is IEEE 128-bit, there will be 2 IEEE 128-bit types (one for the
explicit __float128/_Float128 type and one for long double).

The problem is we cannot create two separate built-in functions that resolve to
the same name.  This is a requirement of add_builtin_function and the C front
end.  That means for the 3 possible modes (IFmode, KFmode, and TFmode), you can
only use 2 of them.

This code does not create the built-in declaration with the changed name.
Instead, it uses the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change the name
before it is written out to the assembler file like it now does for all of the
other long double built-in functions.

When I wrote these patches, I discovered that __ibm128 complex multiply and
divide had originally not been supported if long double is IEEE 128-bit as it
would generate calls to __mulic3 and __divic3.  I added tests in the testsuite
to verify that the correct name (i.e. __multc3 and __divtc3) is used in this
case.
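
The renaming described above can be pictured with a small standalone sketch. This is an assumption-laden illustration, not the rs6000 hook: the function name is hypothetical, and it keys on the default external name for simplicity, whereas the ChangeLog below mentions helpers that work from the built-in function code. The name pairs are the ones mentioned in this message for the case where long double is IEEE 128-bit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative only.  Map the default external name of a complex
   128-bit multiply/divide routine to the name that should reach the
   assembler when long double is IEEE 128-bit.  */
const char *
map_complex_muldiv_name (const char *name, int long_double_is_ieee128)
{
  assert (name != NULL);

  /* Assumption: with IBM long double no renaming is needed.  */
  if (!long_double_is_ieee128)
    return name;

  /* Long double (TFmode) is IEEE, so use the IEEE 128-bit routines.  */
  if (strcmp (name, "__multc3") == 0)
    return "__mulkc3";
  if (strcmp (name, "__divtc3") == 0)
    return "__divkc3";

  /* __ibm128 (IFmode) must call the IBM double-double routines.  */
  if (strcmp (name, "__mulic3") == 0)
    return "__multc3";
  if (strcmp (name, "__divic3") == 0)
    return "__divtc3";

  return name;
}
```

Each input is mapped exactly once, so __mulic3 becomes __multc3 and is not renamed a second time to __mulkc3.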

I had previously sent this patch out on November 1st.  Compared to that version,
this version no longer disables the special mapping when you are building
libgcc, as it turns out we don't need it.

I tested all 3 patches for PR target/107299 on:

1)  LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
2)  LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
3)  LE Power9  using --with-cpu=power9  --with-long-double-format=ibm
4)  BE Power8  using --with-cpu=power8  --with-long-double-format=ibm

Once all 3 patches have been applied, we can once again build GCC when long
double is IEEE 128-bit.  There were no other regressions with these patches.
Can I check these patches into the trunk?

2022-12-14   Michael Meissner  

gcc/

PR target/107299
* config/rs6000/rs6000.cc (create_complex_muldiv): Delete.
(init_float128_ieee): Delete code to switch complex multiply and divide
for long double.
(complex_multiply_builtin_code): New helper function.
(complex_divide_builtin_code): Likewise.
(rs6000_mangle_decl_assembler_name): Add support for mangling the name
of complex 128-bit multiply and divide built-in functions.

gcc/testsuite/

PR target/107299
* gcc.target/powerpc/divic3-1.c: New test.
* gcc.target/powerpc/divic3-2.c: Likewise.
* gcc.target/powerpc/mulic3-1.c: Likewise.
* gcc.target/powerpc/mulic3-2.c: Likewise.
---
 gcc/config/rs6000/rs6000.cc | 109 +++-
 gcc/testsuite/gcc.target/powerpc/divic3-1.c |  18 
 gcc/testsuite/gcc.target/powerpc/divic3-2.c |  17 +++
 gcc/testsuite/gcc.target/powerpc/mulic3-1.c |  18 
 gcc/testsuite/gcc.target/powerpc/mulic3-2.c |  17 +++
 5 files changed, 132 insertions(+), 47 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/divic3-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/divic3-2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/mulic3-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/mulic3-2.c

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index b3a609f3aa3..6d08f6ed1fb 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -11101,26 +11101,6 @@ init_float128_ibm (machine_mode mode)
 }
 }
 
-/* Create a decl for either complex long double multiply or complex long double
-   divide when long double is IEEE 128-bit floating point.  We can't use
-   __multc3 and __divtc3 because the original long double using IBM extended
-   double used those names.  The complex multiply/divide functions are encoded
-   as builtin functions with a complex result and 4 scalar inputs.  */
-
-static void
-create_complex_muldiv (const char *name, built_in_function fncode, tree fntype)
-{
-  tree fndecl = add_builtin_function (name, fntype, fncode, BUILT_IN_NORMAL,
- name, NULL_TREE);
-
-  set_builtin_decl (fncode, fndecl, true);
-
-  if (TARGET_DEBUG_BUILTIN)
-fprintf (stderr, "create complex %s, fncode: %d\n", name, (int) fncode);
-
-  return;
-}
-
 /* Set up IEEE 128-bit floating point routines.  Use different names if the
arguments can 

[PATCH 2/3, V3] PR 107299, Make __float128 use the _Float128 type

2022-12-14 Thread Michael Meissner via Gcc-patches
This patch fixes the issue that GCC cannot build when the default long double
is IEEE 128-bit.  It fails in building libgcc, specifically when it is trying
to build the __mulkc3 function in libgcc.  It is failing in gimple-range-fold.cc
during the evrp pass.  Ultimately it is failing because the code declared the
internal type for one IEEE 128-bit floating point type, and NaN functions use a
different IEEE 128-bit floating point type.

Gimple-range-fold uses the internal types, but there are similar problems when
the code is converted to RTL and the two different modes (KFmode, TFmode) are
used.

typedef float TFtype __attribute__((mode (TF)));
typedef __complex float TCtype __attribute__((mode (TC)));

TCtype
__mulkc3_sw (TFtype a, TFtype b, TFtype c, TFtype d)
{
  TFtype ac, bd, ad, bc, x, y;
  TCtype res;

  ac = a * c;
  bd = b * d;
  ad = a * d;
  bc = b * c;

  x = ac - bd;
  y = ad + bc;

  if (__builtin_isnan (x) && __builtin_isnan (y))
    {
      _Bool recalc = 0;
      if (__builtin_isinf (a) || __builtin_isinf (b))
        {
          a = __builtin_copysignf128 (__builtin_isinf (a) ? 1 : 0, a);
          b = __builtin_copysignf128 (__builtin_isinf (b) ? 1 : 0, b);
          if (__builtin_isnan (c))
            c = __builtin_copysignf128 (0, c);
          if (__builtin_isnan (d))
            d = __builtin_copysignf128 (0, d);
          recalc = 1;
        }
      if (__builtin_isinf (c) || __builtin_isinf (d))
        {
          c = __builtin_copysignf128 (__builtin_isinf (c) ? 1 : 0, c);
          d = __builtin_copysignf128 (__builtin_isinf (d) ? 1 : 0, d);
          if (__builtin_isnan (a))
            a = __builtin_copysignf128 (0, a);
          if (__builtin_isnan (b))
            b = __builtin_copysignf128 (0, b);
          recalc = 1;
        }
      if (!recalc
          && (__builtin_isinf (ac) || __builtin_isinf (bd)
              || __builtin_isinf (ad) || __builtin_isinf (bc)))
        {
          if (__builtin_isnan (a))
            a = __builtin_copysignf128 (0, a);
          if (__builtin_isnan (b))
            b = __builtin_copysignf128 (0, b);
          if (__builtin_isnan (c))
            c = __builtin_copysignf128 (0, c);
          if (__builtin_isnan (d))
            d = __builtin_copysignf128 (0, d);
          recalc = 1;
        }
      if (recalc)
        {
          x = __builtin_inff128 () * (a * c - b * d);
          y = __builtin_inff128 () * (a * d + b * c);
        }
    }

  __real__ res = x;
  __imag__ res = y;
  return res;
}

Currently GCC uses the long double type node for __float128 if long double is
IEEE 128-bit.  It did not use the node for _Float128.

Originally this was noticed if you call the nansq function to make a signaling
NaN (nansq is mapped to nansf128).  Because the type node for _Float128 is
different from __float128, the machine independent code converts signaling NaNs
to quiet NaNs if the types are not compatible.  The following tests used to
fail when run on a system where long double is IEEE 128-bit:

gcc.dg/torture/float128-nan.c
gcc.target/powerpc/nan128-1.c

This patch makes both __float128 and _Float128 use the same type node.

One side effect of not using the long double type node for __float128 is that we
must only use KFmode for _Float128/__float128.  The libstdc++ library won't
build if we use TFmode for _Float128 and __float128 when long double is IEEE
128-bit.

Another minor side effect is that the f128 round to odd fused multiply-add
function will not merge negation with the FMA operation when the type is long
double.  If the type is __float128 or _Float128, then it will continue to do the
optimization.  The round to odd functions are defined in terms of __float128
arguments.  For example:

long double
do_fms (long double a, long double b, long double c)
{
  return __builtin_fmaf128_round_to_odd (a, b, -c);
}

will generate (assuming -mabi=ieeelongdouble):

xsnegqp 4,4
xsmaddqpo 4,2,3
xxlor 34,36,36

while:

__float128
do_fms (__float128 a, __float128 b, __float128 c)
{
  return __builtin_fmaf128_round_to_odd (a, b, -c);
}

will generate:

xsmsubqpo 4,2,3
xxlor 34,36,36

Assuming this patch goes in, we can open a bug about the above optimizations not
working.  However, given that the functions are explicitly documented to use
__float128 types, and the code in the test is using long double, I don't think
it is a high prio

[PATCH 3/3, V3] PR 107299, Update float 128-bit conversion

2022-12-14 Thread Michael Meissner via Gcc-patches
This patch fixes two tests that are still failing when long double is IEEE
128-bit after the previous 2 patches for PR target/107299 have been applied.
The tests are:

gcc.target/powerpc/convert-fp-128.c
gcc.target/powerpc/pr85657-3.c

This patch is a rewrite of the patch submitted on August 18th:

| https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599988.html

This patch reworks the conversions between 128-bit binary floating point types.
Previously, we would call rs6000_expand_float128_convert to do all conversions.
Now, we only define the conversions between the same representation that turn
into a NOP.  The appropriate extend or truncate insn is generated, and after
register allocation, it is converted to a move.

This patch also fixes two places where we want to override the external name
for the conversion function, and the wrong optab was used.  Previously,
rs6000_expand_float128_convert would handle the move or generate the call as
needed.  Now, it lets the machine independent code generate the call.  But if
we use the machine independent code to generate the call, we need to update the
name for two optabs where a truncate would be used in terms of converting
between the modes.  This patch updates those two optabs.
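
The optab mix-up described above can be pictured with a toy lookup table.  This
is only a sketch under stated assumptions: the enum names, register_conv and
lookup_conv are hypothetical and are not GCC's real optab interface.  It only
shows why a library name registered under the wrong conversion kind is never
found once the machine independent code does the lookup itself.

```c
#include <stddef.h>

/* Toy model only: these names are hypothetical, not GCC's optab API.  */
enum conv_kind { CONV_EXTEND, CONV_TRUNC };
enum fmode { MODE_IF, MODE_KF };

struct conv_entry
{
  enum conv_kind kind;
  enum fmode to, from;
  const char *name;
};

#define MAX_ENTRIES 8
static struct conv_entry conv_table[MAX_ENTRIES];
static size_t n_entries;

/* Record the external function name for one (kind, to, from) conversion.  */
static void
register_conv (enum conv_kind kind, enum fmode to, enum fmode from,
               const char *name)
{
  if (n_entries < MAX_ENTRIES)
    {
      struct conv_entry e = { kind, to, from, name };
      conv_table[n_entries++] = e;
    }
}

/* Return the registered name, or NULL if nothing matches all three keys.  */
static const char *
lookup_conv (enum conv_kind kind, enum fmode to, enum fmode from)
{
  for (size_t i = 0; i < n_entries; i++)
    if (conv_table[i].kind == kind
        && conv_table[i].to == to
        && conv_table[i].from == from)
      return conv_table[i].name;
  return NULL;
}
```

In this model, registering "__trunctfkf2" under CONV_EXTEND while the caller
looks it up under CONV_TRUNC returns NULL, which mirrors the kind of mismatch
the two set_conv_libfunc changes address.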

I tested this patch on:

1)  LE Power10 using --with-cpu=power10 --with-long-double-format=ieee
2)  LE Power10 using --with-cpu=power10 --with-long-double-format=ibm
3)  LE Power9  using --with-cpu=power9  --with-long-double-format=ibm
4)  BE Power8  using --with-cpu=power8  --with-long-double-format=ibm

In the past I have also tested this exact patch on the following systems:

1)  LE Power10 using --with-cpu=power9  --with-long-double-format=ibm
2)  LE Power10 using --with-cpu=power8  --with-long-double-format=ibm
3)  LE Power10 using --with-cpu=power10 --with-long-double-format=ibm

There were no regressions in the bootstrap process or running the tests (after
applying all 3 patches for PR target/107299).  Can I check this patch into the
trunk?

2022-12-14   Michael Meissner  

gcc/

PR target/107299
* config/rs6000/rs6000.cc (init_float128_ieee): Use the correct
float_extend or float_truncate optab based on how the machine converts
between IEEE 128-bit and IBM 128-bit.
* config/rs6000/rs6000.md (IFKF): Delete.
(IFKF_reg): Delete.
(extendiftf2): Rewrite to be a move if IFmode and TFmode are both IBM
128-bit.  Do not run if TFmode is IEEE 128-bit.
(extendifkf2): Delete.
(extendtfkf2): Delete.
(extendtfif2): Delete.
(trunciftf2): Delete.
(truncifkf2): Delete.
(trunckftf2): Delete.
(extendkftf2): Implement conversion of IEEE 128-bit types as a move.
(trunctfif2): Delete.
(trunctfkf2): Implement conversion of IEEE 128-bit types as a move.
(extendtf2_internal): Delete.
(extendtf2_internal): Delete.
---
 gcc/config/rs6000/rs6000.cc |   4 +-
 gcc/config/rs6000/rs6000.md | 177 ++--
 2 files changed, 50 insertions(+), 131 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 604f6a9ce33..0a20bfc8421 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -11134,11 +11134,11 @@ init_float128_ieee (machine_mode mode)
   set_conv_libfunc (trunc_optab, SFmode, mode, "__trunckfsf2");
   set_conv_libfunc (trunc_optab, DFmode, mode, "__trunckfdf2");
 
-  set_conv_libfunc (sext_optab, mode, IFmode, "__trunctfkf2");
+  set_conv_libfunc (trunc_optab, mode, IFmode, "__trunctfkf2");
   if (mode != TFmode && FLOAT128_IBM_P (TFmode))
set_conv_libfunc (sext_optab, mode, TFmode, "__trunctfkf2");
 
-  set_conv_libfunc (trunc_optab, IFmode, mode, "__extendkftf2");
+  set_conv_libfunc (sext_optab, IFmode, mode, "__extendkftf2");
   if (mode != TFmode && FLOAT128_IBM_P (TFmode))
set_conv_libfunc (trunc_optab, TFmode, mode, "__extendkftf2");
 
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 6011f5bf76a..799af3c3ebe 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -543,12 +543,6 @@ (define_mode_iterator FMOVE128_GPR [TI
 ; Iterator for 128-bit VSX types for pack/unpack
 (define_mode_iterator FMOVE128_VSX [V1TI KF])
 
-; Iterators for converting to/from TFmode
-(define_mode_iterator IFKF [IF KF])
-
-; Constraints for moving IF/KFmode.
-(define_mode_attr IFKF_reg [(IF "d") (KF "wa")])
-
 ; Whether a floating point move is ok, don't allow SD without hardware FP
 (define_mode_attr fmove_ok [(SF "")
(DF "")
@@ -9096,106 +9090,65 @@ (define_insn "*ieee_128bit_vsx_nabs2_internal"
   "xxlor %x0,%x1,%x2"
   [(set_attr "type" "veclogical")])
 
-;; Float128 conversion functions.  These expand to library function calls.
-;; We use expand to convert from IBM double double to IEEE 128-bit
-;; and trunc for the

[committed] analyzer: don't call binding_key::make on empty regions [PR108065]

2022-12-14 Thread David Malcolm via Gcc-patches
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-4710-g41faa1d7beb90b.

gcc/analyzer/ChangeLog:
PR analyzer/108065
* region.cc (decl_region::get_svalue_for_initializer): Bail out to
avoid calling binding_key::make with an empty region.
* store.cc (binding_map::apply_ctor_val_to_range): Likewise.
(binding_map::apply_ctor_pair_to_child_region): Likewise.
(binding_cluster::bind): Likewise.
(binding_cluster::purge_region): Likewise.
(binding_cluster::maybe_get_compound_binding): Likewise.
(binding_cluster::maybe_get_simple_value): Likewise.

gcc/testsuite/ChangeLog:
PR analyzer/108065
* gfortran.dg/analyzer/pr108065.f90: New test.
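
For readers unfamiliar with the pattern, the "bail out" entries above all have
the same shape: test for an empty region before building a key for it.  A
minimal C sketch, assuming a toy region type; struct region, make_key and
bind_region are hypothetical stand-ins, not the analyzer's classes:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an analyzer region; only the size matters here.  */
struct region
{
  size_t size_in_bits;
};

static bool
region_empty_p (const struct region *r)
{
  return r->size_in_bits == 0;
}

/* Counts how many keys were built, so the guard's effect is observable.  */
static int keys_made;

/* Stand-in for the key constructor, which must not see an empty region.  */
static void
make_key (const struct region *r)
{
  (void) r;
  keys_made++;
}

/* Bind a value to R, bailing out early when R is empty.  */
static bool
bind_region (const struct region *r)
{
  if (region_empty_p (r))
    return false;
  make_key (r);
  return true;
}
```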

Signed-off-by: David Malcolm 
---
 gcc/analyzer/region.cc  |  3 +++
 gcc/analyzer/store.cc   | 14 ++
 gcc/testsuite/gfortran.dg/analyzer/pr108065.f90 | 17 +
 3 files changed, 34 insertions(+)
 create mode 100644 gcc/testsuite/gfortran.dg/analyzer/pr108065.f90

diff --git a/gcc/analyzer/region.cc b/gcc/analyzer/region.cc
index 67ba9486980..83809d6e1c3 100644
--- a/gcc/analyzer/region.cc
+++ b/gcc/analyzer/region.cc
@@ -1208,6 +1208,9 @@ decl_region::get_svalue_for_initializer 
(region_model_manager *mgr) const
   if (DECL_EXTERNAL (m_decl))
return NULL;
 
+  if (empty_p ())
+   return NULL;
+
   /* Implicit initialization to zero; use a compound_svalue for it.
 Doing so requires that we have a concrete binding for this region,
 which can fail if we have a region with unknown size
diff --git a/gcc/analyzer/store.cc b/gcc/analyzer/store.cc
index dd8ebaa7374..f3b500c50a0 100644
--- a/gcc/analyzer/store.cc
+++ b/gcc/analyzer/store.cc
@@ -911,6 +911,8 @@ binding_map::apply_ctor_val_to_range (const region 
*parent_reg,
 return false;
   bit_offset_t start_bit_offset = min_offset.get_bit_offset ();
   store_manager *smgr = mgr->get_store_manager ();
+  if (max_element->empty_p ())
+return false;
   const binding_key *max_element_key = binding_key::make (smgr, max_element);
   if (max_element_key->symbolic_p ())
 return false;
@@ -950,6 +952,8 @@ binding_map::apply_ctor_pair_to_child_region (const region 
*parent_reg,
   else
 {
   const svalue *sval = get_svalue_for_ctor_val (val, mgr);
+  if (child_reg->empty_p ())
+   return false;
   const binding_key *k
= binding_key::make (mgr->get_store_manager (), child_reg);
   /* Handle the case where we have an unknown size for child_reg
@@ -1347,6 +1351,8 @@ binding_cluster::bind (store_manager *mgr,
   return;
 }
 
+  if (reg->empty_p ())
+return;
   const binding_key *binding = binding_key::make (mgr, reg);
   bind_key (binding, sval);
 }
@@ -1419,6 +1425,8 @@ void
 binding_cluster::purge_region (store_manager *mgr, const region *reg)
 {
   gcc_assert (reg->get_kind () == RK_DECL);
+  if (reg->empty_p ())
+return;
   const binding_key *binding
 = binding_key::make (mgr, const_cast (reg));
   m_map.remove (binding);
@@ -1666,6 +1674,9 @@ binding_cluster::maybe_get_compound_binding 
(store_manager *mgr,
   if (reg_offset.symbolic_p ())
 return NULL;
 
+  if (reg->empty_p ())
+return NULL;
+
   region_model_manager *sval_mgr = mgr->get_svalue_manager ();
 
   /* We will a build the result map in two parts:
@@ -2162,6 +2173,9 @@ binding_cluster::maybe_get_simple_value (store_manager 
*mgr) const
   if (m_map.elements () != 1)
 return NULL;
 
+  if (m_base_region->empty_p ())
+return NULL;
+
   const binding_key *key = binding_key::make (mgr, m_base_region);
   return get_any_value (key);
 }
diff --git a/gcc/testsuite/gfortran.dg/analyzer/pr108065.f90 
b/gcc/testsuite/gfortran.dg/analyzer/pr108065.f90
new file mode 100644
index 000..86ba4d4f9aa
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/analyzer/pr108065.f90
@@ -0,0 +1,17 @@
+! { dg-do compile }
+! { dg-additional-options "-fcheck=bounds -Wno-analyzer-malloc-leak" }
+! Copy of gfortran.dg/bounds_check_23.f90
+! as a regression test for ICE with -fanalyzer (PR analyzer/108065)
+
+program test
+  implicit none
+  call sub('Lorem ipsum')
+contains
+  subroutine sub( text )
+character(len=*), intent(in)  :: text
+character(len=1), allocatable :: c(:)
+integer :: i
+c = [ ( text(i:i), i = 1, len(text) ) ]
+if (c(1) /= 'L') stop 1
+  end subroutine sub
+end program test
-- 
2.26.3



Re: [PATCH] rs6000: Fix some issues related to Power10 fusion [PR104024]

2022-12-14 Thread Segher Boessenkool
On Wed, Nov 30, 2022 at 04:30:13PM +0800, Kewen.Lin wrote:
> As PR104024 shows, the option -mpower10-fusion isn't guarded by
> -mcpu=power10, it causes compiler to fuse for some patterns
> even without power10 support and then causes ICE unexpectedly,
> this patch is to simply unmask it without power10 support, not
> emit any warnings as this option is undocumented.

Yes, it mostly exists for debugging purposes (and also for testcases).

> Besides, for some define_insns in fusion.md which use constraint
> v, it requires the condition VECTOR_UNIT_ALTIVEC_OR_VSX_P
> (mode), otherwise it can cause ICE in reload, see test
> case pr104024-2.c.

Please don't do two separate things in one patch.  It makes bisecting
harder than necessary, and perhaps more interesting to you: it makes
writing good changelog entries and commit messages harder.

> --- a/gcc/config/rs6000/genfusion.pl
> +++ b/gcc/config/rs6000/genfusion.pl
> @@ -167,7 +167,7 @@ sub gen_logical_addsubf
>   $inner_comp, $inner_inv, $inner_rtl, $inner_op, $both_commute, $c4,
>   $bc, $inner_arg0, $inner_arg1, $inner_exp, $outer_arg2, $outer_exp,
>   $ftype, $insn, $is_subf, $is_rsubf, $outer_32, $outer_42,$outer_name,
> - $fuse_type);
> + $fuse_type, $constraint_cond);
>KIND: foreach $kind ('scalar','vector') {
>@outer_ops = @logicals;
>if ( $kind eq 'vector' ) {
> @@ -176,12 +176,14 @@ sub gen_logical_addsubf
> $pred = "altivec_register_operand";
> $constraint = "v";
> $fuse_type = "fused_vector";
> +   $constraint_cond = "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode) && ";
>} else {
> $vchr = "";
> $mode = "GPR";
> $pred = "gpc_reg_operand";
> $constraint = "r";
> $fuse_type = "fused_arith_logical";
> +   $constraint_cond = "";
> push (@outer_ops, @addsub);
> push (@outer_ops, ( "rsubf" ));
>}

I don't like this at all.  Please use the "isa" attribute where needed?
Or do you need more in some cases?  But, again, separate patch.

> +  if (TARGET_POWER10
> +  && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION) == 0)
> +rs6000_isa_flags |= OPTION_MASK_P10_FUSION;
> +  else if (!TARGET_POWER10 && TARGET_P10_FUSION)
> +rs6000_isa_flags &= ~OPTION_MASK_P10_FUSION;

That's not right.  If you want something like this you should check for
TARGET_POWER10 whenever you check for TARGET_P10_FUSION; but there
really is no reason at all to disable P10 fusion on other CPUs (neither
newer nor older!).
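
A minimal sketch of the first alternative mentioned here, i.e. testing for
Power10 at the point of use instead of rewriting the flags during option
handling.  The mask values and the helper name want_p10_fusion are
hypothetical; GCC's real definitions differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only.  */
#define MASK_POWER10    (UINT64_C (1) << 0)
#define MASK_P10_FUSION (UINT64_C (1) << 1)

/* Fuse only when the fusion option is set and the CPU is Power10.
   The flags themselves are left untouched.  */
static bool
want_p10_fusion (uint64_t isa_flags)
{
  return (isa_flags & MASK_POWER10) != 0
         && (isa_flags & MASK_P10_FUSION) != 0;
}
```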

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr104024-1.c
> @@ -0,0 +1,16 @@
> +/* { dg-require-effective-target int128 } */
> +/* { dg-options "-O1 -mdejagnu-cpu=power6 -mpower10-fusion" } */

Does this need -O1?  If not, use -O2 please; if so, document it.


Segher


Make '-frust-incomplete-and-experimental-compiler-do-not-use' a 'Common' option (was: Rust front-end patches v4)

2022-12-14 Thread Thomas Schwinge
Hi!

On 2022-12-13T14:40:36+0100, Arthur Cohen  wrote:
> We've also added one more commit, which only affects files inside the
> Rust front-end folder. This commit adds an experimental flag, which
> blocks the compilation of Rust code when not used.

(That's commit r13-4675-gb07ef39ffbf4e77a586605019c64e2e070915ac3
"gccrs: Add fatal_error when experimental flag is not present".)

I noticed that GCC/Rust recently lost all LTO variants in torture
testing -- due to this commit.  :-O

OK to push the attached
"Make '-frust-incomplete-and-experimental-compiler-do-not-use' a 'Common' 
option",
or should this be done differently?

With that, we get back:

 PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O0  (test for 
excess errors)
 PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O1  (test for 
excess errors)
 PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2  (test for 
excess errors)
+PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none  (test for excess errors)
+PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
 PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O3 -g  (test 
for excess errors)
 PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -Os  (test for 
excess errors)

Etc., and in total:

=== rust Summary for unix ===

# of expected passes[-4990-]{+6718+}
# of expected failures  [-39-]{+51+}


Regards
 Thomas


> We hope this helps
> indicate to users that the compiler is not yet ready, but can still be
> experimented with :)
>
> We plan on removing that flag as soon as possible, but in the meantime,
> we think it will help avoid creating a divide within the Rust ecosystem, as
> well as avoid wasting Rust crate maintainers' time.


>From 3b2a8a4df1637a0cad738165a2afa9b34e286fcf Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Wed, 14 Dec 2022 17:16:42 +0100
Subject: [PATCH] Make '-frust-incomplete-and-experimental-compiler-do-not-use'
 a 'Common' option

I noticed that GCC/Rust recently lost all LTO variants in torture testing:

 PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O0  (test for excess errors)
 PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O1  (test for excess errors)
 PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2  (test for excess errors)
-PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
-PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
 PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O3 -g  (test for excess errors)
 PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -Os  (test for excess errors)

Etc.

The reason is that when probing for availability of LTO, we run into:

spawn [...]/build-gcc/gcc/testsuite/rust/../../gccrs -B[...]/build-gcc/gcc/testsuite/rust/../../ -fdiagnostics-plain-output -frust-incomplete-and-experimental-compiler-do-not-use -flto -c -o lto8274.o lto8274.c
cc1: warning: command-line option '-frust-incomplete-and-experimental-compiler-do-not-use' is valid for Rust but not for C

For GCC/Rust testing, this flag is defaulted in
'gcc/testsuite/lib/rust.exp:rust_init':

lappend ALWAYS_RUSTFLAGS "additional_flags=-frust-incomplete-and-experimental-compiler-do-not-use"

Make it generally accepted without "is valid for Rust but not for [...]"
diagnostic.

	gcc/rust/
	* lang.opt
	(-frust-incomplete-and-experimental-compiler-do-not-use): Remove.
	gcc/
	* common.opt
	(-frust-incomplete-and-experimental-compiler-do-not-use): New.
---
 gcc/common.opt| 6 ++
 gcc/rust/lang.opt | 4 
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 562d73d7f552..eba28e650f94 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2552,6 +2552,12 @@ frounding-math
 Common Var(flag_rounding_math) Optimization SetByCombined
 Disable optimizations that assume default FP rounding behavior.
 
+; This option applies to Rust only, but is defined 'Common' here, so that it's
+; generally accepted without "is valid for Rust but not for [...]" diagnostic.
+frust-incomplete-and-experimental-compiler-do-not-use
+Common Var(flag_rust_experimental)
+Enable experimental compilation of Rust files at your own risk.
+
 fsched-interblock
 Common Var(flag_schedule_interblock) Init(1) Optimization
 Enable scheduling across basic blocks.
diff --gi

Re: [PATCH 19/56] Revert "Move void_list_node init to common code". (8ff2a92a0450243e52d3299a13b30f208bafa7e0)

2022-12-14 Thread Zopolis0 via Gcc-patches
I had a look-- issue fixed, rough patch below. Full patch will be part of v2.

>From b0d93d8212328fabcbdb32c266c265a4eed49e00 Mon Sep 17 00:00:00 2001
From: Maximilian Downey Twiss 
Date: Thu, 15 Dec 2022 09:54:36 +1100
Subject: [PATCH] java: Adjustments to end_params_node and void_list_node.

gcc/java/ChangeLog:

* builtins.cc (initialize_builtins): Do not set void_list_node to
end_params_node.
* decl.cc (java_init_decl_processing): Set end_params_node to void_list_node.
---
 gcc/java/builtins.cc | 2 --
 gcc/java/decl.cc | 2 +-
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/gcc/java/builtins.cc b/gcc/java/builtins.cc
index 45d736a0d7b8..a882a5c4d521 100644
--- a/gcc/java/builtins.cc
+++ b/gcc/java/builtins.cc
@@ -499,8 +499,6 @@ initialize_builtins (void)
   java_builtins[i].method_name.t = m;
 }

-  void_list_node = end_params_node;
-
   float_ftype_float_float
 = build_function_type_list (float_type_node,
  float_type_node, float_type_node, NULL_TREE);
diff --git a/gcc/java/decl.cc b/gcc/java/decl.cc
index 6319d1ce18a0..018003104ced 100644
--- a/gcc/java/decl.cc
+++ b/gcc/java/decl.cc
@@ -957,7 +957,7 @@ java_init_decl_processing (void)
   build_decl (BUILTINS_LOCATION,
   TYPE_DECL, get_identifier ("Method"), method_type_node);

-  end_params_node = tree_cons (NULL_TREE, void_type_node, NULL_TREE);
+  end_params_node = void_list_node;

   t = build_function_type_list (ptr_type_node, class_ptr_type, NULL_TREE);
   alloc_object_node = add_builtin_function ("_Jv_AllocObject", t,


Re: Java front-end and library patches.

2022-12-14 Thread Zopolis0 via Gcc-patches
Patch 19 has been resolved.


[PATCH] c++: partial ordering with memfn pointer cst [PR108104]

2022-12-14 Thread Patrick Palka via Gcc-patches
Here we're triggering an overzealous assert in unify during partial
ordering since the member function pointer constants are represented as
ordinary CONSTRUCTORs (with TYPE_PTRMEMFUNC_P TREE_TYPE) but the assert
expects only COMPOUND_LITERAL_P constructors.

Bootstrapped and regtested on x86_64-pc-linux, does this look OK for
trunk and perhaps 12?

PR c++/108104

gcc/cp/ChangeLog:

* pt.cc (unify) : Relax assert to accept any
CONSTRUCTOR not just COMPOUND_LITERAL_P ones.

gcc/testsuite/ChangeLog:

* g++.dg/template/ptrmem33.C: New test.
---
 gcc/cp/pt.cc |  2 +-
 gcc/testsuite/g++.dg/template/ptrmem33.C | 30 
 2 files changed, 31 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/template/ptrmem33.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 2f0f7a39698..44058d30799 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -24921,7 +24921,7 @@ unify (tree tparms, tree targs, tree parm, tree arg, 
int strict,
   if (is_overloaded_fn (parm) || type_unknown_p (parm))
return unify_success (explain_p);
   gcc_assert (EXPR_P (parm)
- || COMPOUND_LITERAL_P (parm)
+ || TREE_CODE (parm) == CONSTRUCTOR
  || TREE_CODE (parm) == TRAIT_EXPR);
 expr:
   /* We must be looking at an expression.  This can happen with
diff --git a/gcc/testsuite/g++.dg/template/ptrmem33.C 
b/gcc/testsuite/g++.dg/template/ptrmem33.C
new file mode 100644
index 000..dca741ae5e2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/ptrmem33.C
@@ -0,0 +1,30 @@
+// PR c++/108104
+// { dg-do compile { target c++11 } }
+
+struct A {
+  void x();
+  void y();
+};
+
+enum State { On };
+
+template
+struct B {
+  static void f();
+};
+
+template
+struct B {
+  static void g();
+};
+
+template
+struct B {
+  static void h();
+};
+
+int main() {
+  B::f();
+  B::g();
+  B::h();
+}
-- 
2.39.0.56.g57e2c6ebbe



Re: Java front-end and library patches.

2022-12-14 Thread Zopolis0 via Gcc-patches
Re: 1dedc12d186a110854537e1279b4e6c29f2df35a breakage

I've done some more research; it's not the -dumpbase. Comparing the
java frontend on master with one based on a commit right before
1dedc12, master passes '-dumpdir' '.libs/jv-convert-' while the
earlier one does not, which I believe is causing the breakage.

Master:
zopolis4@epidural ~/g/x/libjava> /bin/bash ./libtool --tag=GCJ
--mode=link /home/zopolis4/gcjbuild/./gcc/gcj
-B/home/zopolis4/gcjbuild/x86_64-pc-linux-gnu/libjava/ -B/h
ome/zopolis4/gcjbuild/./gcc/ -B/usr/local/x86_64-pc-linux-gnu/bin/
-B/usr/local/x86_64-pc-linux-gnu/lib/ -isystem
/usr/local/x86_64-pc-linux-gnu/include -isystem /usr/local
/x86_64-pc-linux-gnu/sys-include
-L/home/zopolis4/gcjbuild/x86_64-pc-linux-gnu/libjava
-fomit-frame-pointer -Usun -v -g -O2  -o jv-convert
--main=gnu.gcj.convert.Convert
 -rpath /usr/local/lib/../lib64 -shared-libgcc
-L/home/zopolis4/gcjbuild/x86_64-pc-linux-gnu/libjava/.libs libgcj.la
libtool: link: /home/zopolis4/gcjbuild/./gcc/gcj
-B/home/zopolis4/gcjbuild/x86_64-pc-linux-gnu/libjava/
-B/home/zopolis4/gcjbuild/./gcc/ -B/usr/local/x86_64-pc-linux-gnu/bin/
-B/usr/local/x86_64-pc-linux-gnu/lib/ -isystem
/usr/local/x86_64-pc-linux-gnu/include -isystem
/usr/local/x86_64-pc-linux-gnu/sys-include -fomit-frame-pointer -Usun
-v -g -O2 -o .libs/jv-convert --main=gnu.gcj.convert.Convert
-shared-libgcc
-L/home/zopolis4/gcjbuild/x86_64-pc-linux-gnu/libjava/.libs
-L/home/zopolis4/gcjbuild/x86_64-pc-linux-gnu/libjava
./.libs/libgcj.so -lpthread -lrt -lltdl -lgc -Wl,-rpath
-Wl,/usr/local/lib/../lib64
Reading specs from /home/zopolis4/gcjbuild/./gcc/specs
Reading specs from
/home/zopolis4/gcjbuild/x86_64-pc-linux-gnu/libjava/libgcj.spec
rename spec startfile to startfileorig
rename spec lib to liborig
COLLECT_GCC=/home/zopolis4/gcjbuild/./gcc/gcj
COLLECT_LTO_WRAPPER=/home/zopolis4/gcjbuild/./gcc/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /home/zopolis4/gcjbuild/../gcj/configure
--disable-bootstrap --disable-multilib --disable-libstdcxx
--disable-libquadmath --disable-libgomp --disable-libgfortran
--enable-languages=java : (reconfigured)
/home/zopolis4/gcjbuild/../gcj/configure --disable-bootstrap
--disable-multilib --disable-libstdcxx --disable-libquadmath
--disable-libgomp --disable-libgfortran --disable-libgm2
--enable-languages=java : (reconfigured)
/home/zopolis4/gcjbuild/../gcj/configure --disable-bootstrap
--disable-multilib --disable-libstdcxx --disable-libquadmath
--disable-libgomp --disable-libgfortran --disable-libgm2
LDFLAGS='-static-libstdc++ -static-libgcc
-L/usr/lib/gcc/x86_64-linux-gnu/12' --enable-languages=c,c++,java,lto
--no-create --no-recursion
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 13.0.0 20221214 (experimental) (GCC)
COLLECT_GCC_OPTIONS='-B'
'/home/zopolis4/gcjbuild/x86_64-pc-linux-gnu/libjava/' '-B'
'/home/zopolis4/gcjbuild/./gcc/' '-B'
'/usr/local/x86_64-pc-linux-gnu/bin/' '-B'
'/usr/local/x86_64-pc-linux-gnu/lib/' '-isystem'
'/usr/local/x86_64-pc-linux-gnu/include' '-isystem'
'/usr/local/x86_64-pc-linux-gnu/sys-include' '-fomit-frame-pointer'
'-U' 'sun' '-v' '-g' '-O2' '-o' '.libs/jv-convert' '-shared-libgcc'
'-L/home/zopolis4/gcjbuild/x86_64-pc-linux-gnu/libjava/.libs'
'-L/home/zopolis4/gcjbuild/x86_64-pc-linux-gnu/libjava'
'-fbootclasspath=./:/usr/local/share/java/libgcj-13.0.0.jar'
'-specs=/home/zopolis4/gcjbuild/x86_64-pc-linux-gnu/libjava/libgcj.spec'
'-shared-libgcc' '-mtune=generic' '-march=x86-64' '-dumpdir'
'.libs/jv-convert-' //here is the big differentiation point
 /home/zopolis4/gcjbuild/./gcc/jvgenmain
.libs/jv-convert-gnu.gcj.convert.Convertmain /tmp/ccomuP4d.i
COLLECT_GCC_OPTIONS='-B'
'/home/zopolis4/gcjbuild/x86_64-pc-linux-gnu/libjava/' '-B'
'/home/zopolis4/gcjbuild/./gcc/' '-B'
'/usr/local/x86_64-pc-linux-gnu/bin/' '-B'
'/usr/local/x86_64-pc-linux-gnu/lib/' '-isystem'
'/usr/local/x86_64-pc-linux-gnu/include' '-isystem'
'/usr/local/x86_64-pc-linux-gnu/sys-include' '-fomit-frame-pointer'
'-U' 'sun' '-v' '-g' '-O2' '-o' '.libs/jv-convert' '-shared-libgcc'
'-L/home/zopolis4/gcjbuild/x86_64-pc-linux-gnu/libjava/.libs'
'-L/home/zopolis4/gcjbuild/x86_64-pc-linux-gnu/libjava'
'-specs=/home/zopolis4/gcjbuild/x86_64-pc-linux-gnu/libjava/libgcj.spec'
'-shared-libgcc' '-mtune=generic' '-march=x86-64' '-dumpdir'
'.libs/jv-convert-'
 /home/zopolis4/gcj

Re: [PATCH V6] rs6000: Optimize cmp on rotated 16bits constant

2022-12-14 Thread Jiufu Guo via Gcc-patches
Hi,

Jiufu Guo via Gcc-patches  writes:

> Hi,
>
> Segher Boessenkool  writes:
>
>> Hi!
>>
>> Sorry for the tardiness.
>>
>> On Mon, Aug 29, 2022 at 11:42:16AM +0800, Jiufu Guo wrote:
>>> When checking eq/ne with a constant which has only 16bits, it can be
>>> optimized to check the rotated data.  By this, the constant building
>>> is optimized.
>>> 
>>> As the example in PR103743:
>>> For "in == 0x8000LL", this patch generates:
>>> rotldi %r3,%r3,16
>>> cmpldi %cr0,%r3,32768
>>> instead:
>>> li %r9,-1
>>> rldicr %r9,%r9,0,0
>>> cmpd %cr0,%r3,%r9
>>
>> FWIW, I find the winnt assembler syntax very hard to read, and I doubt
>> I am the only one.
> Oh, sorry about that.  I will avoid adding '-mregnames' when dumping asm. :)
> BTW, what options do you use to dump asm code?
>>
>> So you're doing
>>   rotldi 3,3,16 ; cmpldi 3,0x8000
>> instead of
>>   li 9,-1 ; rldicr 9,9,0,0 ; cmpd 3,9
>>
>>> +/* Check if C can be rotated from an immediate which starts (as 64bit 
>>> integer)
>>> +   with at least CLZ bits zero.
>>> +
>>> +   Return the number by which C can be rotated from the immediate.
>>> +   Return -1 if C can not be rotated as from.  */
>>> +
>>> +int
>>> +rotate_from_leading_zeros_const (unsigned HOST_WIDE_INT c, int clz)
>>
>> The name does not say what the function does.  Can you think of a better
>> name?
>>
>> Maybe it is better to not return magic values anyway?  So perhaps
>>
>> bool
>> can_be_done_as_compare_of_rotate (unsigned HOST_WIDE_INT c, int clz, int 
>> *rot)
>>
>> (with *rot written if the return value is true).
> Thanks for your suggestion!
> It checks whether a constant can be rotated from/to a value which has
> only a few trailing nonzero bits (all leading bits are zero).
>
> So I'm thinking of naming the function something like:
> can_be_rotated_to_lowbits.
>
>>
>>> +  /* case c. xx10.0xx: rotate 'clz + 1' bits firstly, then check case 
>>> b.
>>
>> s/firstly/first/
> Thanks! 
>>
>>> +/* Check if C can be rotated from an immediate operand of cmpdi or cmpldi. 
>>>  */
>>> +
>>> +bool
>>> +compare_rotate_immediate_p (unsigned HOST_WIDE_INT c)
>>
>> No _p please, this function is not a predicate (at least, the name does
>> not say what it tests).  So a better name please.  This matters even
>> more for extern functions (like this one) because the function
>> implementation is always farther away so you do not easily have all
>> interface details in mind.  Good names help :-)
> Thanks! Naming always matters. :)
>
> Maybe we can name this function "can_be_rotated_as_compare_operand",
> or "is_constant_rotateable_for_compare", because this function checks
> "if a constant can be rotated to/from an immediate operand of
> cmpdi/cmpldi".
>
>>
>>> +(define_code_iterator eqne [eq ne])
>>> +(define_code_attr EQNE [(eq "EQ") (ne "NE")])
>>
>> Just <code> or <CODE> should work?
> Great! Thanks for pointing this out!  <CODE> works.
>>
>> Please fix these things.  Almost there :-)
>
> I updated the patch as below. Bootstrapping and regtesting are ongoing.
> Thanks again for your careful and insightful review!

Bootstrap and regtests pass on ppc64{,le}.
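
For reference, the rotation idea from the quoted discussion can be sketched in
plain C.  This is an illustration under stated assumptions, not the code from
the patch: find_rotate_to_u16 is a hypothetical helper, it covers only the
unsigned 16-bit immediate case (cmpldi), and it uses a brute-force search
rather than the leading-zero logic the patch implements.  The constant in the
usage note is an assumption too: the archived text shows "0x8000LL", but the
quoted instruction sequences are consistent with 0x8000000000000000.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate X left by N bits (N taken modulo 64).  */
static uint64_t
rotl64 (uint64_t x, unsigned n)
{
  n &= 63;
  return n ? (x << n) | (x >> (64 - n)) : x;
}

/* Hypothetical helper: find the smallest left rotation of C whose result
   fits in an unsigned 16-bit immediate.  On success store the rotation
   count in *ROT and return true.  */
static bool
find_rotate_to_u16 (uint64_t c, unsigned *rot)
{
  for (unsigned n = 0; n < 64; n++)
    if (rotl64 (c, n) <= 0xFFFFu)
      {
        *rot = n;
        return true;
      }
  return false;
}
```

With c = 0x8000000000000000 the helper reports a rotation of 1, and the rotated
value is 1, matching the "rotldi 3,3,1 ; cmpldi 0,3,1" sequence: since rotation
is a bijection, comparing the rotated register against the rotated constant
gives the same eq/ne answer as comparing the originals.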

BR,
Jeff (Jiufu)

>
>
> BR,
> Jeff (Jiufu)
>
> --
> When checking eq/ne with a constant which has only 16bits, it can be
> optimized to check the rotated data.  By this, the constant building
> is optimized.
>
> As the example in PR103743:
> For "in == 0x8000LL", this patch generates:
> rotldi 3,3,1 ; cmpldi 0,3,1
> instead of:
> li 9,-1 ; rldicr 9,9,0,0 ; cmpd 0,3,9
>
> Compared with the previous version:
> This patch refactors the code according to review comments,
> e.g. updating function names/comments/code.
>
>
>   PR target/103743
>
> gcc/ChangeLog:
>
>   * config/rs6000/rs6000-protos.h (can_be_rotated_to_lowbits): New.
>   (can_be_rotated_as_compare_operand): New.
>   * config/rs6000/rs6000.cc (can_be_rotated_to_lowbits): New definition.
>   (can_be_rotated_as_compare_operand): New definition.
>   * config/rs6000/rs6000.md (*rotate_on_cmpdi): New define_insn_and_split.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/powerpc/pr103743.c: New test.
>   * gcc.target/powerpc/pr103743_1.c: New test.
>
> ---
>  gcc/config/rs6000/rs6000-protos.h |  2 +
>  gcc/config/rs6000/rs6000.cc   | 56 +++
>  gcc/config/rs6000/rs6000.md   | 63 +++-
>  gcc/testsuite/gcc.target/powerpc/pr103743.c   | 52 ++
>  gcc/testsuite/gcc.target/powerpc/pr103743_1.c | 95 +++
>  5 files changed, 267 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743_1.c
>
> diff --git a/gcc/config/rs6000/rs6000-protos.h 
> b/gcc/config/rs6000/rs6000-protos.h
> index d0d89320ef6..9626917e359 100644
> --- a/gcc/config/rs6000/rs6000-protos.h
> +++ b/gcc/config/rs6000/rs6000-protos.h
> @@ -35,6 +35,8 @@ extern bool xx

Re: [PATCH V4 1/2] rs6000: use li;x?oris to build constant

2022-12-14 Thread Jiufu Guo via Gcc-patches
Hi,

"Kewen.Lin"  writes:

> Hi Jeff,
>
> on 2022/12/12 09:38, Jiufu Guo via Gcc-patches wrote:
>> Hi,
>> 
>> For constant C:
>> If '(c & 0x8000ULL) == 0x8000ULL' or say:
>> 32(1) || 16(x) || 1(1) || 15(x), using "li; xoris" would be ok.
>> 
>> If '(c & 0x80008000ULL) == 0x8000ULL' or say:
>> 32(0) || 1(1) || 15(x) || 1(0) || 15(x), we could use "li; oris" to
>> build constant 'C'.
>> 
>> Here N(M) means N consecutive bits of value M; x for M means either
>> 1 or 0 is ok; '||' means concatenation.
>> 
>> This patch updates rs6000_emit_set_long_const to support those constants.
>> 
>> Compared with the previous version, this patch fixes conflicts with trunk
>> and puts li;x?oris as the first patch (lis;xoris as the second patch).
>> Previous version:
>> https://gcc.gnu.org/pipermail/gcc-patches/2022-December/607618.html
>> 
>> Bootstrap and regtest pass on ppc64{,le}.
>> 
>> Is this ok for trunk?
>> 
>> BR,
>> Jeff (Jiufu)
>> 
>> 
>>  PR target/106708
>> 
>> gcc/ChangeLog:
>> 
>>  * config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Add using
>>  "li; x?oris" to build constant.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  * gcc.target/powerpc/pr106708.c: New test.
>> 
>> ---
>>  gcc/config/rs6000/rs6000.cc | 36 +++---
>>  gcc/testsuite/gcc.target/powerpc/pr106708.c | 41 +
>>  2 files changed, 71 insertions(+), 6 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr106708.c
>> 
>> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
>> index b3a609f3aa3..8c1192a10c8 100644
>> --- a/gcc/config/rs6000/rs6000.cc
>> +++ b/gcc/config/rs6000/rs6000.cc
>> @@ -10251,17 +10251,41 @@ rs6000_emit_set_long_const (rtx dest, 
>> HOST_WIDE_INT c)
>>if (ud1 != 0)
>>  emit_move_insn (dest, gen_rtx_IOR (DImode, temp, GEN_INT (ud1)));
>>  }
>> +  else if (ud4 == 0x && ud3 == 0x && (ud1 & 0x8000))
>> +{
>> +  /* li; xoris */
>> +  temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
>> +  emit_move_insn (temp, GEN_INT (sext_hwi (ud1, 16)));
>> +  emit_move_insn (dest, gen_rtx_XOR (DImode, temp,
>> + GEN_INT ((ud2 ^ 0x) << 16)));
>> +}
>>else if (ud3 == 0 && ud4 == 0)
>>  {
>>temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
>>  
>>gcc_assert (ud2 & 0x8000);
>> -  emit_move_insn (temp, GEN_INT (sext_hwi (ud2 << 16, 32)));
>> -  if (ud1 != 0)
>> -emit_move_insn (temp, gen_rtx_IOR (DImode, temp, GEN_INT (ud1)));
>> -  emit_move_insn (dest,
>> -  gen_rtx_ZERO_EXTEND (DImode,
>> -   gen_lowpart (SImode,temp)));
>> +
>> +  if (ud1 == 0)
>> +{
>> +  /* lis; rldicl */
>> +  emit_move_insn (temp, GEN_INT (sext_hwi (ud2 << 16, 32)));
>> +  emit_move_insn (dest,
>> +  gen_rtx_AND (DImode, temp, GEN_INT (0x)));
>> +}
>> +  else if (!(ud1 & 0x8000))
>> +{
>> +  /* li; oris */
>> +  emit_move_insn (temp, GEN_INT (ud1));
>> +  emit_move_insn (dest,
>> +  gen_rtx_IOR (DImode, temp, GEN_INT (ud2 << 16)));
>> +}
>> +  else
>> +{
>
> Nit: Add "/* lis; ori; rldicl */" like the other arms?
>
>> +  emit_move_insn (temp, GEN_INT (sext_hwi (ud2 << 16, 32)));
>> +  emit_move_insn (temp, gen_rtx_IOR (DImode, temp, GEN_INT (ud1)));
>> +  emit_move_insn (dest,
>> +  gen_rtx_AND (DImode, temp, GEN_INT (0x)));
>> +}
>>  }
>>else if (ud1 == ud3 && ud2 == ud4)
>>  {
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr106708.c 
>> b/gcc/testsuite/gcc.target/powerpc/pr106708.c
>> new file mode 100644
>> index 000..dc9ceda8367
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr106708.c
>> @@ -0,0 +1,41 @@
>> +/* PR target/106708 */
>> +/* { dg-do run } */
>> +/* { dg-options "-O2 -mno-prefixed -save-temps" } */
>> +/* { dg-require-effective-target has_arch_ppc64 } */
>> +
>> +long long arr[]
>> +  = {0x7cdeab55LL, 0x98765432LL, 0xabcdLL};
>> +
>> +void __attribute__ ((__noipa__)) lixoris (long long *arg)
>
> Nit: Adding a separator "_" to make the name like "li_xoris" or even
> "test_li_xoris" seems easier to read.  The same applies to the other
> function names "lioris" and "lisrldicl".
>
> The others look good to me.  Thanks!
>

Thanks a lot for your review and comments!
I will update the patch accordingly.

BR,
Jeff (Jiufu)

> BR,
> Kewen
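
The two-instruction sequences discussed in this thread can be modelled in
plain C to check the arithmetic.  This is only a sketch, not code from the
patch: the function names are made up for illustration, and it assumes that
the masks whose digits are missing from the archived diff are 0xffff.

```c
#include <stdint.h>

/* Model of "li rT, ud1" followed by "xoris rD, rT, imm".
   li sign-extends its 16-bit immediate to 64 bits; xoris XORs a
   zero-extended 16-bit immediate into bits 16..31.
   ASSUMPTION: the elided mask in the archived diff is 0xffff.
   Intended for ud1 with bit 15 set, so the upper 48 bits start as ones.  */
uint64_t
model_li_xoris (uint16_t ud1, uint16_t ud2)
{
  uint64_t temp = (uint64_t) (int64_t) (int16_t) ud1;          /* li */
  return temp ^ ((uint64_t) (uint16_t) (ud2 ^ 0xffff) << 16);  /* xoris */
}

/* Model of "li rT, ud1" followed by "oris rD, rT, ud2".
   Intended for ud1 with bit 15 clear, so li does not sign-extend
   and the upper 32 bits stay zero.  */
uint64_t
model_li_oris (uint16_t ud1, uint16_t ud2)
{
  uint64_t temp = ud1;                    /* li */
  return temp | ((uint64_t) ud2 << 16);   /* oris */
}
```

With ud1 = 0xab55 and ud2 = 0x7cde the first model yields a value whose
upper 32 bits are all ones and whose lower 32 bits are 0x7cdeab55; with
ud1 = 0x5432 and ud2 = 0x9876 the second yields 0x98765432 with a zero
upper half.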


Re: AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]

2022-12-14 Thread Pop, Sebastian via Gcc-patches
Hi Richard,


Tomorrow I will commit the attached patches to the active branches gcc-10,
gcc-11, and gcc-12.

The patches passed bootstrap and regression test on arm64-linux.


Sebastian


From: Richard Sandiford 
Sent: Thursday, December 8, 2022 1:38:07 AM
To: Pop, Sebastian
Cc: gcc-patches@gcc.gnu.org; seb...@gmail.com; Kyrylo Tkachov
Subject: RE: [EXTERNAL]AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]

CAUTION: This email originated from outside of the organization. Do not click 
links or open attachments unless you can confirm the sender and know the 
content is safe.



"Pop, Sebastian"  writes:
> Hi Richard,
>
>
> Please find attached a patch that follows your recommendations to generate 
> the BTI_C instructions.
>
> Please let me know if the patch can be further improved.
>
> The patch passed bootstrap and regression tests on arm64-linux.

LGTM.  OK for trunk, thanks, and for release branches after a grace period.

Richard

> Thanks,
>
> Sebastian
>
> 
> From: Richard Sandiford 
> Sent: Wednesday, December 7, 2022 3:12:08 AM
> To: Pop, Sebastian
> Cc: gcc-patches@gcc.gnu.org; seb...@gmail.com; Kyrylo Tkachov
> Subject: RE: [EXTERNAL]AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]
>
>
>
>
> "Pop, Sebastian"  writes:
>> Thanks Richard for your review and for pointing out the issue with BTI.
>>
>>
>> The current patch removes the existing BTI instruction,
>>
>> and then adds the BTI hint when expanding the patchable_area pseudo.
>
> Thanks.  I still think...
>
>> The attached patch passed bootstrap and regression test on arm64-linux.
>>
>> Ok to commit to gcc trunk?
>>
>>
>> Thank you,
>> Sebastian
>>
>> 
>> From: Richard Sandiford 
>> Sent: Monday, December 5, 2022 5:34:40 AM
>> To: Pop, Sebastian
>> Cc: gcc-patches@gcc.gnu.org; seb...@gmail.com; Kyrylo Tkachov
>> Subject: RE: [EXTERNAL]AArch64: Add UNSPECV_PATCHABLE_AREA [PR98776]
>>
>>
>>
>>
>> "Pop, Sebastian"  writes:
>>> Hi,
>>>
>>> Currently patchable area is at the wrong place on AArch64.  It is placed
>>> immediately after function label, before .cfi_startproc.  This patch
>>> adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and
>>> modifies aarch64_print_patchable_function_entry to avoid placing
>>> patchable area before .cfi_startproc.
>>>
>>> The patch passed bootstrap and regression test on aarch64-linux.
>>> Ok to commit to trunk and backport to active release branches?
>>
>> Looks good, but doesn't the problem described in the PR then still
>> apply to the BTI emitted by:
>>
>>   if (cfun->machine->label_is_assembled
>>   && aarch64_bti_enabled ()
>>   && !cgraph_node::get (cfun->decl)->only_called_directly_p ())
>> {
>>   /* Remove the BTI that follows the patch area and insert a new BTI
>>  before the patch area right after the function label.  */
>>   rtx_insn *insn = next_real_nondebug_insn (get_insns ());
>>   if (insn
>>   && INSN_P (insn)
>>   && GET_CODE (PATTERN (insn)) == UNSPEC_VOLATILE
>>   && XINT (PATTERN (insn), 1) == UNSPECV_BTI_C)
>> delete_insn (insn);
>>   asm_fprintf (file, "\thint\t34 // bti c\n");
>> }
>>
>> ?  It seems like the BTI will be before the cfi_startproc and the
>> patchable entry afterwards.
>>
>> I guess we should keep the BTI instruction as-is (rather than printing
>> a .hint) and emit the new UNSPECV_PATCHABLE_AREA after the BTI rather
>> than before it.
>
> ...this approach would be slightly cleaner though.  The .hint asm string
> we're emitting here is exactly the same as the one emitted by the
> original bti_c instruction.  The only reason for deleting the
> instruction and emitting text was because we were emitting the
> patchable entry directly as text, and the BTI text had to come
> before the patchable entry text.
>
> Now that we're emitting the patchable entry via a normal instruction
> (a good thing!) we can keep the preceding bti_c as a normal instruction
> too.  That is, I think we should use emit_insn_after to emit the entry
> after the bti_c insn (if it exists) instead of before BB_HEAD.
>
> Thanks,
> Richard
>
>>> gcc/
>>> PR target/93492
>>> * config/aarch64/aarch64-protos.h (aarch64_output_patchable_area):
>>> Declared.
>>> * config/aarch64/aarch64.cc 
>>> (aarch64_print_patchable_function_entry):
>>> Emit an UNSPECV_PATCHABLE_AREA pseudo instruction.
>>> (aarch64_output_patchable_area): New.
>>> * config/aarch64/aarch64.md (UNSPECV_PATCHABLE_AREA): New.
>>> (patchable_area): Define.
>>>
>>> gcc/tes

[pushed] c++: fix initializer_list transformation [PR108071]

2022-12-14 Thread Jason Merrill via Gcc-patches
Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

In these testcases, we weren't adequately verifying that constructing the
element type from an array element would have the same effect as
constructing it from one of the initializers.

PR c++/108071
PR c++/105838

gcc/cp/ChangeLog:

* call.cc (struct conversion_obstack_sentinel): New.
(maybe_init_list_as_array): Compare conversion of dummy argument.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist131.C: New test.
* g++.dg/cpp0x/initlist132.C: New test.
* g++.dg/cpp0x/initlist133.C: New test.
---
 gcc/cp/call.cc   | 35 
 gcc/testsuite/g++.dg/cpp0x/initlist131.C | 14 ++
 gcc/testsuite/g++.dg/cpp0x/initlist132.C | 30 
 gcc/testsuite/g++.dg/cpp0x/initlist133.C | 25 +
 4 files changed, 98 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/initlist131.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/initlist132.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/initlist133.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 33b5e7f87f5..c25df174280 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -622,6 +622,15 @@ conversion_obstack_alloc (size_t n)
   return p;
 }
 
+/* RAII class to discard anything added to conversion_obstack.  */
+
+struct conversion_obstack_sentinel
+{
+  void *p;
+  conversion_obstack_sentinel (): p (conversion_obstack_alloc (0)) {}
+  ~conversion_obstack_sentinel () { obstack_free (&conversion_obstack, p); }
+};
+
 /* Allocate rejection reasons.  */
 
 static struct rejection_reason *
@@ -4219,18 +4228,32 @@ static tree
 maybe_init_list_as_array (tree elttype, tree init)
 {
   /* Only do this if the array can go in rodata but not once converted.  */
-  if (!CLASS_TYPE_P (elttype))
+  if (!TYPE_NON_AGGREGATE_CLASS (elttype))
 return NULL_TREE;
   tree init_elttype = braced_init_element_type (init);
   if (!init_elttype || !SCALAR_TYPE_P (init_elttype) || !TREE_CONSTANT (init))
 return NULL_TREE;
 
+  /* Check with a stub expression to weed out special cases, and check whether
+ we call the same function for direct-init as copy-list-init.  */
+  conversion_obstack_sentinel cos;
+  tree arg = build_stub_object (init_elttype);
+  conversion *c = implicit_conversion (elttype, init_elttype, arg, false,
+  LOOKUP_NORMAL, tf_none);
+  if (c && c->kind == ck_rvalue)
+c = next_conversion (c);
+  if (!c || c->kind != ck_user)
+return NULL_TREE;
+
   tree first = CONSTRUCTOR_ELT (init, 0)->value;
-  if (TREE_CODE (init_elttype) == INTEGER_TYPE && null_ptr_cst_p (first))
-/* Avoid confusion from treating 0 as a null pointer constant.  */
-first = build1 (UNARY_PLUS_EXPR, init_elttype, first);
-  first = (perform_implicit_conversion_flags
-  (elttype, first, tf_none, LOOKUP_IMPLICIT|LOOKUP_NO_NARROWING));
+  conversion *fc = implicit_conversion (elttype, init_elttype, first, false,
+   LOOKUP_IMPLICIT|LOOKUP_NO_NARROWING,
+   tf_none);
+  if (fc && fc->kind == ck_rvalue)
+fc = next_conversion (fc);
+  if (!fc || fc->kind != ck_user || fc->cand->fn != c->cand->fn)
+return NULL_TREE;
+  first = convert_like (fc, first, tf_none);
   if (first == error_mark_node)
 /* Let the normal code give the error.  */
 return NULL_TREE;
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist131.C 
b/gcc/testsuite/g++.dg/cpp0x/initlist131.C
new file mode 100644
index 000..a714215219a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist131.C
@@ -0,0 +1,14 @@
+// PR c++/108071
+// { dg-do compile { target c++11 } }
+
+#include 
+
+struct OptSpecifier {
+  explicit OptSpecifier(bool);
+  OptSpecifier(unsigned);
+};
+void f (std::initializer_list);
+int main()
+{
+  f({1});
+}
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist132.C 
b/gcc/testsuite/g++.dg/cpp0x/initlist132.C
new file mode 100644
index 000..34e0307cbbc
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist132.C
@@ -0,0 +1,30 @@
+// PR c++/108071
+// { dg-do compile { target c++11 } }
+
+#include 
+
+template< typename T1, typename T2 = void >
+struct ConstCharArrayDetector
+{
+static const bool ok = false;
+};
+template< std::size_t N, typename T >
+struct ConstCharArrayDetector< const char[ N ], T >
+{
+typedef T Type;
+};
+
+struct Dummy { };
+
+struct OUString
+{
+  template
+OUString(T&, typename ConstCharArrayDetector::Type = Dummy())
+{ }
+};
+
+struct Sequence {
+  Sequence(std::initializer_list);
+};
+
+Sequence s = {""};
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist133.C 
b/gcc/testsuite/g++.dg/cpp0x/initlist133.C
new file mode 100644
index 000..08da5bebd0b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist133.C
@@ -0,0 +1,25 @@
+// PR c++/108071
+// { dg-do compile { target c++14 } }
+
+#include 
+
+template struct

[PATCH V2 1/2] x86: Don't add crtfastmath.o for -shared

2022-12-14 Thread liuhongt via Gcc-patches
Update in V2:
Split -shared change into a separate commit and add some documentation
for it.
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

Don't add crtfastmath.o for -shared to avoid changing the MXCSR register
when loading a shared library.  crtfastmath.o will be used only when
building executables.
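
For readers unfamiliar with the flags involved, here is a small sketch of
what "changing the MXCSR register" amounts to.  It is not the crtfastmath.o
source; the macro and function names are illustrative.  It relies only on
the architectural bit positions of DAZ (bit 6) and FTZ (bit 15) in MXCSR
and manipulates a plain integer, so it does not touch the real register.

```c
#include <stdint.h>

/* Architectural MXCSR bit positions.  */
#define MXCSR_DAZ (1u << 6)    /* denormals-are-zero */
#define MXCSR_FTZ (1u << 15)   /* flush-to-zero */

/* Return the MXCSR value with both flags set (hypothetical helper;
   real startup code would read and write the register itself).  */
uint32_t
mxcsr_with_daz_ftz (uint32_t mxcsr)
{
  return mxcsr | MXCSR_DAZ | MXCSR_FTZ;
}
```

Starting from the power-on default 0x1f80, the result is 0x9fc0.  Since
MXCSR is per-thread state, startup code in a shared library that set these
bits would silently change the floating-point behaviour of the program
loading it, which is the concern behind this patch.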

 PR target/55522
* config/i386/gnu-user-common.h (GNU_USER_TARGET_MATHFILE_SPEC):
Don't add crtfastmath.o for -shared.
* doc/invoke.texi (-shared): Add related documentation.
---
 gcc/config/i386/gnu-user-common.h | 2 +-
 gcc/doc/invoke.texi   | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/gnu-user-common.h 
b/gcc/config/i386/gnu-user-common.h
index cab9be2bfb7..9910cd64363 100644
--- a/gcc/config/i386/gnu-user-common.h
+++ b/gcc/config/i386/gnu-user-common.h
@@ -47,7 +47,7 @@ along with GCC; see the file COPYING3.  If not see
 
 /* Similar to standard GNU userspace, but adding -ffast-math support.  */
 #define GNU_USER_TARGET_MATHFILE_SPEC \
-  "%{Ofast|ffast-math|funsafe-math-optimizations:crtfastmath.o%s} \
+  "%{Ofast|ffast-math|funsafe-math-optimizations:%{!shared:crtfastmath.o%s}} \
%{mpc32:crtprec32.o%s} \
%{mpc64:crtprec64.o%s} \
%{mpc80:crtprec80.o%s}"
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index cb40b38b73a..cba4f19f4f4 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -17656,7 +17656,8 @@ needs to build supplementary stub code for constructors 
to work.  On
 multi-libbed systems, @samp{gcc -shared} must select the correct support
 libraries to link against.  Failing to supply the correct flags may lead
 to subtle defects.  Supplying them in cases where they are not necessary
-is innocuous.}
+is innocuous. For x86, crtfastmath.o will not be added when
+@option{-shared} is specified. }
 
 @item -shared-libgcc
 @itemx -static-libgcc
-- 
2.27.0



[PATCH V2 2/2] [x86] x86: Add a new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.

2022-12-14 Thread liuhongt via Gcc-patches
Update in v2:
1. Support -mno-daz-ftz, and make the option effectively three-state:

if (mdaz-ftz)
  link crtfastmath.o
else if ((Ofast || ffast-math || funsafe-math-optimizations)
 && !shared && !mno-daz-ftz)
  link crtfastmath.o
else
  Don't link crtfastmath.o

2. Still make the option Target, since
   a. with Driver we get: cc1: error: command-line option ‘-mdaz-ftz’ is
      valid for the driver but not for C
   b. there's no real variable specified by mdaz-ftz; in options.h it is
      marked as
   #ifndef GENERATOR_FILE
  int x_VAR_mdaz_ftz;
  #define x_VAR_mdaz_ftz do_not_use
  #endif

and is not saved and restored in cl_target_option_save and
cl_target_option_restore (am I missing something?)

3. Capitalize the first letter and add more description of -mdaz-ftz and
-shared.
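
The three-state behaviour from item 1 can be written down as a small
predicate.  This is just a sketch of the pseudo-code above, not the driver
spec itself; the function and parameter names are invented for
illustration, and fast_math stands for any of -Ofast, -ffast-math or
-funsafe-math-optimizations.

```c
#include <stdbool.h>

/* Would crtfastmath.o be linked for this combination of options?
   Mirrors the if/else-if/else pseudo-code in item 1.  */
bool
links_crtfastmath (bool mdaz_ftz, bool mno_daz_ftz,
                   bool fast_math, bool shared)
{
  if (mdaz_ftz)
    return true;                 /* explicit request wins, even with -shared */
  if (fast_math && !shared && !mno_daz_ftz)
    return true;                 /* implied by the fast-math options */
  return false;
}
```

So -mdaz-ftz -shared links crtfastmath.o, -Ofast -shared does not, and
-Ofast -mno-daz-ftz does not either.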

gcc/ChangeLog:

PR target/55522
PR target/36821
* config/i386/gnu-user-common.h (GNU_USER_TARGET_MATHFILE_SPEC):
Link crtfastmath.o whenever -mdaz-ftz is specified. Don't link
crtfastmath.o when -shared or -mno-daz-ftz is specified.
* config/i386/i386.opt (mdaz-ftz): New option.
* doc/invoke.texi (x86 options): Document -mdaz-ftz.
---
 gcc/config/i386/gnu-user-common.h |  2 +-
 gcc/config/i386/i386.opt  |  4 
 gcc/doc/invoke.texi   | 12 +++-
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/gnu-user-common.h 
b/gcc/config/i386/gnu-user-common.h
index 9910cd64363..f910524a6c3 100644
--- a/gcc/config/i386/gnu-user-common.h
+++ b/gcc/config/i386/gnu-user-common.h
@@ -47,7 +47,7 @@ along with GCC; see the file COPYING3.  If not see
 
 /* Similar to standard GNU userspace, but adding -ffast-math support.  */
 #define GNU_USER_TARGET_MATHFILE_SPEC \
-  "%{Ofast|ffast-math|funsafe-math-optimizations:%{!shared:crtfastmath.o%s}} \
+  
"%{mdaz-ftz:crtfastmath.o%s;Ofast|ffast-math|funsafe-math-optimizations:%{!shared:%{!mno-daz-ftz:crtfastmath.o%s}}}
 \
%{mpc32:crtprec32.o%s} \
%{mpc64:crtprec64.o%s} \
%{mpc80:crtprec80.o%s}"
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index fb4e57ada7c..0b7df429734 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -420,6 +420,10 @@ mpc80
 Target RejectNegative
 Set 80387 floating-point precision to 80-bit.
 
+mdaz-ftz
+Target
+Set the FTZ and DAZ Flags.
+
 mpreferred-stack-boundary=
 Target RejectNegative Joined UInteger Var(ix86_preferred_stack_boundary_arg)
 Attempt to keep stack aligned to this power of 2.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index cba4f19f4f4..7f1d002f228 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1433,7 +1433,7 @@ See RS/6000 and PowerPC Options.
 -m96bit-long-double  -mlong-double-64  -mlong-double-80  -mlong-double-128 @gol
 -mregparm=@var{num}  -msseregparm @gol
 -mveclibabi=@var{type}  -mvect8-ret-in-mem @gol
--mpc32  -mpc64  -mpc80  -mstackrealign @gol
+-mpc32  -mpc64  -mpc80  -mdaz-ftz -mstackrealign @gol
 -momit-leaf-frame-pointer  -mno-red-zone  -mno-tls-direct-seg-refs @gol
 -mcmodel=@var{code-model}  -mabi=@var{name}  -maddress-mode=@var{mode} @gol
 -m32  -m64  -mx32  -m16  -miamcu  -mlarge-data-threshold=@var{num} @gol
@@ -32753,6 +32753,16 @@ are enabled by default; routines in such libraries 
could suffer significant
 loss of accuracy, typically through so-called ``catastrophic cancellation'',
 when this option is used to set the precision to less than extended precision.
 
+@item -mdaz-ftz
+@opindex mdaz-ftz
+
+The flush-to-zero (FTZ) and denormals-are-zero (DAZ) flags in the MXCSR 
register
+are used to control floating-point calculations.SSE and AVX instructions
+including scalar and vector instructions could benefit from enabling the FTZ
+and DAZ flags when @option{-mdaz-ftz} is specified. Don't set FTZ/DAZ flags
+when @option{-mno-daz-ftz} or @option{-shared} is specified, @option{-mdaz-ftz}
+will set FTZ/DAZ flags even with @option{-shared}.
+
 @item -mstackrealign
 @opindex mstackrealign
 Realign the stack at entry.  On the x86, the @option{-mstackrealign}
-- 
2.27.0



Re: [PATCH] libgccjit: Fix a failing test

2022-12-14 Thread Guillaume Gomez via Gcc-patches
Forgot it indeed, thanks for notifying me!

I modified the commit message to add it and added it into this email.

On Wed, Dec 14, 2022 at 16:12, Antoni Boucher  wrote:

> Thanks!
>
> In your patch, you're missing this line at the end of the commit
> message:
>
>Signed-off-by: Guillaume Gomez 
>
> On Wed, 2022-12-14 at 14:39 +0100, Guillaume Gomez via Jit wrote:
> > Hi,
> >
> > This fixes bug 107999.
> >
> > Thanks in advance for the review.
>
>
From 30f049d4f39de392dbb3cff4b64779f2507fc914 Mon Sep 17 00:00:00 2001
From: Guillaume Gomez 
Date: Wed, 14 Dec 2022 14:28:22 +0100
Subject: [PATCH] Fix a failing test by updating its error string.

gcc/testsuite/ChangeLog:

	* jit.dg/test-error-array-bounds.c: Update test.

Signed-off-by: Guillaume Gomez 
---
 gcc/testsuite/jit.dg/test-error-array-bounds.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/jit.dg/test-error-array-bounds.c b/gcc/testsuite/jit.dg/test-error-array-bounds.c
index b6c0ee526d4..a0dead13cb7 100644
--- a/gcc/testsuite/jit.dg/test-error-array-bounds.c
+++ b/gcc/testsuite/jit.dg/test-error-array-bounds.c
@@ -70,5 +70,5 @@ verify_code (gcc_jit_context *ctxt, gcc_jit_result *result)
   /* ...and that the message was captured by the API.  */
   CHECK_STRING_VALUE (gcc_jit_context_get_first_error (ctxt),
 		  "array subscript 10 is above array bounds of"
-		  " 'char[10]' [-Warray-bounds]");
+		  " 'char[10]' [-Warray-bounds=]");
 }
-- 
2.34.1



Re: [PATCH V2 2/2] [x86] x86: Add a new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.

2022-12-14 Thread Jakub Jelinek via Gcc-patches
On Thu, Dec 15, 2022 at 02:21:37PM +0800, liuhongt via Gcc-patches wrote:
> --- a/gcc/config/i386/i386.opt
> +++ b/gcc/config/i386/i386.opt
> @@ -420,6 +420,10 @@ mpc80
>  Target RejectNegative
>  Set 80387 floating-point precision to 80-bit.
>  
> +mdaz-ftz
> +Target

s/Target/Driver/

> +Set the FTZ and DAZ Flags.

Jakub



Re: [PATCH V2 2/2] [x86] x86: Add a new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.

2022-12-14 Thread Hongtao Liu via Gcc-patches
On Thu, Dec 15, 2022 at 3:39 PM Jakub Jelinek  wrote:
>
> On Thu, Dec 15, 2022 at 02:21:37PM +0800, liuhongt via Gcc-patches wrote:
> > --- a/gcc/config/i386/i386.opt
> > +++ b/gcc/config/i386/i386.opt
> > @@ -420,6 +420,10 @@ mpc80
> >  Target RejectNegative
> >  Set 80387 floating-point precision to 80-bit.
> >
> > +mdaz-ftz
> > +Target
>
> s/Target/Driver/
After changing it to Driver I got an error like: cc1: error: command-line
option ‘-mdaz-ftz’ is valid for the driver but not for C.
>
> > +Set the FTZ and DAZ Flags.
>
> Jakub
>


-- 
BR,
Hongtao


Re: [PATCH 2/3] Make __float128 use the _Float128 type, PR target/107299

2022-12-14 Thread Kewen.Lin via Gcc-patches
>> I bet the above workaround in generic code was added for a reason, it would
>> surprise me if _Float128 worked at all without that hack.
> 
> OK, I'll have a look at those nan failures soon.

Investigating the exposed NaN failures, I found that they occur because the
compiler wants to convert a _Float128 type constant to a long double type
constant; this goes through function real_convert, which clears the
signalling bit in the context of !HONOR_SNANS (arg).

  if (r->cl == rvc_nan)
r->signalling = 0;

The test cases don't pass the explicit option -fsignaling-nans; I'm inclined
to believe that's intentional since there is only an sNaN generation.  If so,
we don't want this kind of conversion, which is useless and can clear the
signalling bit unexpectedly.  One shortcut is to just copy the corresponding
REAL_VALUE_TYPE and rebuild with the given type if the modes are the same.

-
diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index e80be8049e1..d036b09dc6f 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -2178,6 +2178,14 @@ fold_convert_const_real_from_real (tree type, const_tree 
arg1)
   REAL_VALUE_TYPE value;
   tree t;

+  /* If the underlying modes are the same, just copy the
+ TREE_REAL_CST information and rebuild with the given type.  */
+  if (TYPE_MODE (type) == TYPE_MODE (TREE_TYPE (arg1)))
+{
+  t = build_real (type, TREE_REAL_CST (arg1));
+  return t;
+}
+
   /* Don't perform the operation if flag_signaling_nans is on
  and the operand is a signaling NaN.  */
   if (HONOR_SNANS (arg1)

-

The above diff can fix all exposed NaN failures.
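
To make the effect of the shortcut concrete, here is a toy model of the
folding path.  It is not GCC code: struct real_model, the integer mode
numbers and the function name are all hypothetical stand-ins, and the model
only captures the three behaviours described above (same-mode copy, no
folding of an sNaN when sNaNs are honoured, and real_convert quieting a
NaN).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the parts of REAL_VALUE_TYPE that matter.  */
struct real_model
{
  bool is_nan;
  bool signalling;
};

/* Returns true and fills *out if the constant is folded.  */
bool
fold_real_from_real_model (struct real_model in, int from_mode, int to_mode,
                           bool honor_snans, struct real_model *out)
{
  /* Proposed shortcut: same underlying mode, so copy the value as-is.  */
  if (from_mode == to_mode)
    {
      *out = in;
      return true;
    }

  /* Existing guard: don't fold a signalling NaN when sNaNs are honoured.  */
  if (honor_snans && in.is_nan && in.signalling)
    return false;

  /* Model of real_convert: a converted NaN loses its signalling bit.  */
  *out = in;
  if (out->is_nan)
    out->signalling = false;
  return true;
}
```

In this model an sNaN constant keeps its signalling bit only when the two
modes compare equal, which is the case the diff adds.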

BR,
Kewen


Re: Ping---[V3][PATCH 2/2] Add a new warning option -Wstrict-flex-arrays.

2022-12-14 Thread Richard Biener via Gcc-patches
On Wed, 14 Dec 2022, Qing Zhao wrote:

> Hi, Richard,
> 
> I guess that we now agreed on the following:
> 
> “ the information that we ran into a trailing array but didn't consider 
> it a flex array because of -fstrict-flex-arrays is always a useful 
> information”
> 
> The only thing we didn’t decide is:
> 
> A. Amend such new information to -Warray-bounds when -fstrict-flex-arrays=N 
> (N>0) specified.
> 
> OR
> 
> B. Issue such new information with a new warning option -Wstrict-flex-arrays 
> when -fstrict-flex-arrays=N (N>0) specified.
> 
> My current patch implemented B. 

Plus it implements it to specify a different flex-array variant for
the extra diagnostic.

> If you think A is better, I will change the patch as A. 

I would tend to A since, as I said, it's useful information that
shouldn't be hidden and not adding an option removes odd combination
possibilities such as -Wno-array-bounds -Wstrict-flex-arrays.
In particular I find, say, -fstrict-flex-arrays=2 -Wstrict-flex-arrays=1
hardly useful.

But I'm interested in other opinions.

Thanks,
Richard.

> Let me know your opinion.
> 
> thanks.
> 
> Qing
> 
> 
> > On Dec 14, 2022, at 4:03 AM, Richard Biener  wrote:
> > 
> > On Tue, 13 Dec 2022, Qing Zhao wrote:
> > 
> >> Richard, 
> >> 
> >> Do you have any decision on this one? 
> >> Do we need this warning option For GCC? 
> > 
> > Looking at the testcases it seems that the diagnostic amends
> > -Warray-bounds diagnostics for trailing but not flexible arrays?
> > Wouldn't it be better to generally diagnose this, so have
> > -Warray-bounds, with -fstrict-flex-arrays, for
> > 
> > struct X { int a[1]; };
> > int foo (struct X *p)
> > {
> >  return p->a[1];
> > }
> > 
> > emit
> > 
> > warning: array subscript 1 is above array bounds ...
> > note: the trailing array is only a flexible array member with 
> > -fno-strict-flex-arrays
> > 
> > ?  Having -Wstrict-flex-arrays=N and N not agree with the
> > -fstrict-flex-arrays level sounds hardly useful to me but the
> > information that we ran into a trailing array but didn't consider
> > it a flex array because of -fstrict-flex-arrays is always a
> > useful information?
> > 
> > But maybe I misunderstood this new diagnostic?
> > 
> > Thanks,
> > Richard.
> > 
> > 
> >> thanks.
> >> 
> >> Qing
> >> 
> >>> On Dec 6, 2022, at 11:18 AM, Qing Zhao  wrote:
> >>> 
> >>> '-Wstrict-flex-arrays'
> >>>Warn about improper usages of flexible array members according to
> >>>the LEVEL of the 'strict_flex_array (LEVEL)' attribute attached to
> >>>the trailing array field of a structure if it's available,
> >>>otherwise according to the LEVEL of the option
> >>>'-fstrict-flex-arrays=LEVEL'.
> >>> 
> >>>This option is effective only when LEVEL is bigger than 0.
> >>>Otherwise, it will be ignored with a warning.
> >>> 
> >>>when LEVEL=1, warnings will be issued for a trailing array
> >>>reference of a structure that have 2 or more elements if the
> >>>trailing array is referenced as a flexible array member.
> >>> 
> >>>when LEVEL=2, in addition to LEVEL=1, additional warnings will be
> >>>issued for a trailing one-element array reference of a structure if
> >>>the array is referenced as a flexible array member.
> >>> 
> >>>when LEVEL=3, in addition to LEVEL=2, additional warnings will be
> >>>issued for a trailing zero-length array reference of a structure if
> >>>the array is referenced as a flexible array member.
> >>> 
> >>> gcc/ChangeLog:
> >>> 
> >>>   * doc/invoke.texi: Document -Wstrict-flex-arrays option.
> >>>   * gimple-array-bounds.cc (check_out_of_bounds_and_warn): Add two more
> >>>   arguments.
> >>>   (array_bounds_checker::check_array_ref): Issue warnings for
> >>>   -Wstrict-flex-arrays.
> >>>   * opts.cc (finish_options): Issue warning for unsupported combination
> >>>   of -Wstrict_flex_arrays and -fstrict-flex-array.
> >>>   * tree-vrp.cc (execute_ranger_vrp): Enable the pass when
> >>>   warn_strict_flex_array is true.
> >>> 
> >>> gcc/c-family/ChangeLog:
> >>> 
> >>>   * c.opt (Wstrict-flex-arrays): New option.
> >>> 
> >>> gcc/testsuite/ChangeLog:
> >>> 
> >>>   * gcc.dg/Warray-bounds-flex-arrays-1.c: Update testing case with
> >>>   -Wstrict-flex-arrays.
> >>>   * gcc.dg/Warray-bounds-flex-arrays-2.c: Likewise.
> >>>   * gcc.dg/Warray-bounds-flex-arrays-3.c: Likewise.
> >>>   * gcc.dg/Warray-bounds-flex-arrays-4.c: Likewise.
> >>>   * gcc.dg/Warray-bounds-flex-arrays-5.c: Likewise.
> >>>   * gcc.dg/Warray-bounds-flex-arrays-6.c: Likewise.
> >>>   * c-c++-common/Wstrict-flex-arrays.c: New test.
> >>>   * gcc.dg/Wstrict-flex-arrays-2.c: New test.
> >>>   * gcc.dg/Wstrict-flex-arrays-3.c: New test.
> >>>   * gcc.dg/Wstrict-flex-arrays.c: New test.
> >>> ---
> >>> gcc/c-family/c.opt|   5 +
> >>> gcc/doc/invoke.texi   |  27 -
> >>> gcc/gimple-array-bounds.cc| 103 ++
> >>> gcc/opts.cc 
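
The LEVEL rules in the quoted documentation can be summarised as a small
predicate.  This is a reading of that text only, not of the patch's
implementation; the function name and the convention of passing -1 for a
true flexible array member (declared with []) are made up for the sketch.
It assumes the trailing array is being referenced as a flexible array
member.

```c
#include <stdbool.h>

/* nelts: declared element count of the trailing array, or -1 for "[]".
   Returns whether the documented -Wstrict-flex-arrays rules would warn.  */
bool
wstrict_flex_arrays_warns (int level, int nelts)
{
  if (level <= 0 || nelts < 0)
    return false;          /* option has no effect / a real flexible array */
  if (nelts >= 2)
    return level >= 1;     /* two or more elements: from LEVEL=1 */
  if (nelts == 1)
    return level >= 2;     /* one-element array: from LEVEL=2 */
  return level >= 3;       /* zero-length array: only at LEVEL=3 */
}
```

For Richard's struct X { int a[1]; } example this gives a warning at
LEVEL=2 and LEVEL=3 but not at LEVEL=1.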

[PATCH] into-ssa: Fix emitting debug stmts after asm goto [PR108095]

2022-12-14 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase ICEs, because ccp1 replaced
  s.0_1 = &s;
  __asm__ goto("" : "=r" MEM[(T *)s.0_1] :  :  : "lab" lab);
with
  __asm__ goto("" : "=r" s :  :  : "lab" lab);
and because s is no longer addressable, we are rewriting it into
ssa and want
  __asm__ goto("" : "=r" s_7 :  :  : "lab" lab);
plus debug stmt
  # DEBUG s => s_7
The code assumes that there is at most one non-EH edge in that
case, but with the addition of outputs to asm goto that is no longer the
case, we can have many outgoing edges.

The patch keeps the checking assertion that there is at most one such
edge for everything but asm goto, but moves the addition of the debug
stmt into the loop, so that it can be added on all edges where it is
possible, not just one of them.

Furthermore, looking at gsi_insert_on_edge_immediate
-> gimple_find_edge_insert_loc, the conditions to insert stmt there
to the destination block are
  if (single_pred_p (dest)
  && gimple_seq_empty_p (phi_nodes (dest))
  && dest != EXIT_BLOCK_PTR_FOR_FN (cfun))
(plus there is code to insert it in the previous block but that is
never true when the pred is known to be stmt_ends_bb_p), while
maybe_register_def was just checking
 if (ef && single_pred_p (ef->dest)
 && ef->dest != EXIT_BLOCK_PTR_FOR_FN (cfun))
so if for whatever reason ef->dest had any PHIs, we'd split the
edge for -g and not for -g0, something we must avoid for -fcompare-debug
stability.  So, I've added the no phi_nodes check too.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-12-15  Jakub Jelinek  

PR tree-optimization/108095
* tree-into-ssa.cc (maybe_register_def): Insert debug stmt
on all non-EH edges from asm goto if they have a single
predecessor rather than asserting there is at most one such edge.
Test whether there are no PHI nodes next to the single predecessor
test.

* gcc.dg/pr108095.c: New test.

--- gcc/tree-into-ssa.cc.jj 2022-12-05 23:20:24.0 +0100
+++ gcc/tree-into-ssa.cc2022-12-14 14:20:50.301283097 +0100
@@ -1934,9 +1934,8 @@ maybe_register_def (def_operand_p def_p,
  tree tracked_var = target_for_debug_bind (sym);
  if (tracked_var)
{
- gimple *note = gimple_build_debug_bind (tracked_var, def, stmt);
- /* If stmt ends the bb, insert the debug stmt on the single
-non-EH edge from the stmt.  */
+ /* If stmt ends the bb, insert the debug stmt on the non-EH
+edge(s) from the stmt.  */
  if (gsi_one_before_end_p (gsi) && stmt_ends_bb_p (stmt))
{
  basic_block bb = gsi_bb (gsi);
@@ -1945,33 +1944,46 @@ maybe_register_def (def_operand_p def_p,
  FOR_EACH_EDGE (e, ei, bb->succs)
if (!(e->flags & EDGE_EH))
  {
-   gcc_checking_assert (!ef);
+   /* asm goto can have multiple non-EH edges from the
+  stmt.  Insert on all of them where it is
+  possible.  */
+   gcc_checking_assert (!ef || (gimple_code (stmt)
+== GIMPLE_ASM));
ef = e;
- }
- /* If there are other predecessors to ef->dest, then
-there must be PHI nodes for the modified
-variable, and therefore there will be debug bind
-stmts after the PHI nodes.  The debug bind notes
-we'd insert would force the creation of a new
-block (diverging codegen) and be redundant with
-the post-PHI bind stmts, so don't add them.
+   /* If there are other predecessors to ef->dest, then
+  there must be PHI nodes for the modified
+  variable, and therefore there will be debug bind
+  stmts after the PHI nodes.  The debug bind notes
+  we'd insert would force the creation of a new
+  block (diverging codegen) and be redundant with
+  the post-PHI bind stmts, so don't add them.
 
-As for the exit edge, there wouldn't be redundant
-bind stmts, but there wouldn't be a PC to bind
-them to either, so avoid diverging the CFG.  */
- if (ef && single_pred_p (ef->dest)
- && ef->dest != EXIT_BLOCK_PTR_FOR_FN (cfun))
-   {
- /* If there were PHI nodes in the node, we'd
-have to make sure the value we're binding
-doesn't need rewriting.  But there shouldn't
-be PHI nodes in a single-predecessor block,
-

Re: Make '-frust-incomplete-and-experimental-compiler-do-not-use' a 'Common' option (was: Rust front-end patches v4)

2022-12-14 Thread Richard Biener via Gcc-patches
On Wed, Dec 14, 2022 at 11:58 PM Thomas Schwinge
 wrote:
>
> Hi!
>
> On 2022-12-13T14:40:36+0100, Arthur Cohen  wrote:
> > We've also added one more commit, which only affects files inside the
> > Rust front-end folder. This commit adds an experimental flag, which
> > blocks the compilation of Rust code when not used.
>
> (That's commit r13-4675-gb07ef39ffbf4e77a586605019c64e2e070915ac3
> "gccrs: Add fatal_error when experimental flag is not present".)
>
> I noticed that GCC/Rust recently lost all LTO variants in torture
> testing -- due to this commit.  :-O
>
> OK to push the attached
> "Make '-frust-incomplete-and-experimental-compiler-do-not-use' a 'Common' 
> option",
> or should this be done differently?

Just add 'LTO' to the option in lang.opt, like

frust-incomplete-and-experimental-compiler-do-not-use
Rust LTO Var(flag_rust_experimental)
Enable experimental compilation of Rust files at your own risk


>
> With that, we get back:
>
>  PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O0  (test for excess errors)
>  PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O1  (test for excess errors)
>  PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2  (test for excess errors)
> +PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
> +PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
>  PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -O3 -g  (test for excess errors)
>  PASS: rust/compile/torture/all_doc_comment_line_blocks.rs   -Os  (test for excess errors)
>
> Etc., and in total:
>
> === rust Summary for unix ===
>
> # of expected passes[-4990-]{+6718+}
> # of expected failures  [-39-]{+51+}
>
>
> Regards
>  Thomas
>
>
> > We hope this helps
> > indicate to users that the compiler is not yet ready, but can still be
> > experimented with :)
> >
> > We plan on removing that flag as soon as possible, but in the meantime,
> > we think it will help avoid creating a divide within the Rust ecosystem,
> > as well as avoid wasting Rust crate maintainers' time.
>
>


Re: [PATCH 2/3] Make __float128 use the _Float128 type, PR target/107299

2022-12-14 Thread Kewen.Lin via Gcc-patches
on 2022/12/14 18:33, Jakub Jelinek wrote:
> On Wed, Dec 14, 2022 at 06:11:26PM +0800, Kewen.Lin wrote:
>>> The hacks with different precisions of powerpc 128-bit floating types are
>>> very unfortunate, it is I assume because the middle-end asserted that scalar
>>> floating point types with different modes have different precision.
>>> We no longer assert that, as BFmode and HFmode (__bf16 and _Float16) have
>>> the same 16-bit precision as well and e.g. C++ FE knows to treat standard
>>> vs. extended floating point types vs. other unknown floating point types
>>> differently in finding result type of binary operations or in which type
>>> comparisons will be done.  
>>
>> That's good news.  For now those three long double modes on Power have
>> different precisions; if they can have the same precision, I'd expect the
>> ICE to be gone.
> 
> I'm talking mainly about r13-3292: the asserts now check that different modes
> have different precision unless it is half vs. brain or vice versa.  That
> could be changed further, but if the precision is the same, the FEs
> and the middle-end need to know how to deal with those.
> For C++23, say when __ibm128 is the same as long double and _Float128 is
> supported, the 2 types are unordered (neither is a subset or superset of
> the other because there are many _Float128 values one can't represent
> in double double (whether anything with exponent larger than what double
> can represent or most of the more precise values), but because of the
> variable precision there are double double values that can't be represented
> in _Float128 either) and so we can error on comparisons of those or on
> arithmetic with such arguments (unless explicitly converted first).
> But for backwards compatibility we can't do that for __float128 vs. __ibm128
> and so need to backwards compatibly decide what wins.  And for the
> middle-end say even for mode conversions decide what is widening and what is
> narrowing even when they are unordered.
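
A small illustration of the "unordered" point quoted above. This is only a
sketch and not anything GCC itself does: it uses exact rational arithmetic in
Python to exhibit one value on each side. The helper names (sig_bits_needed,
exponent, fits_binary128) and the two sample values are invented for this
note, and subnormals are ignored for simplicity.

```python
from fractions import Fraction
import sys

BINARY128_SIG_BITS = 113    # significand precision of IEEE binary128
BINARY128_MAX_EXP = 16383   # largest unbiased exponent of a finite binary128
BINARY128_MIN_EXP = -16382  # smallest normal exponent (subnormals ignored here)


def sig_bits_needed(x: Fraction) -> int:
    """Significand bits needed to write a positive dyadic rational exactly."""
    n, d = x.numerator, x.denominator
    assert n > 0 and d & (d - 1) == 0, "expects a positive value with 2**k denominator"
    n >>= (n & -n).bit_length() - 1  # drop trailing zero bits
    return n.bit_length()


def exponent(x: Fraction) -> int:
    """floor(log2(x)) for a positive dyadic rational."""
    return x.numerator.bit_length() - x.denominator.bit_length()


def fits_binary128(x: Fraction) -> bool:
    """True if x is exactly representable as a normal binary128 value."""
    return (sig_bits_needed(x) <= BINARY128_SIG_BITS
            and BINARY128_MIN_EXP <= exponent(x) <= BINARY128_MAX_EXP)


# A double-double value is an unevaluated sum hi + lo of two doubles, where
# hi is the sum rounded to double.  The gap between hi and lo may be huge,
# which is the "variable precision" mentioned above.
hi, lo = 1.0, 2.0 ** -1000
assert hi + lo == hi                 # hi already is the rounded sum
x = Fraction(hi) + Fraction(lo)      # exactly 1 + 2**-1000

# Any double-double value is below 2 * DBL_MAX in magnitude, because hi is a
# finite double and lo is at most half an ulp of hi.
dd_upper_bound = 2 * Fraction(sys.float_info.max)
y = Fraction(2 ** 2000)              # fine for binary128's exponent range

print("bits needed for x:", sig_bits_needed(x))
print("x fits binary128:", fits_binary128(x))
print("y fits binary128:", fits_binary128(y))
print("y within double-double range:", y < dd_upper_bound)
```

Here x is a valid double-double value that needs far more than 113 significand
bits, while y is a binary128 value whose exponent is out of reach for any pair
of doubles, so neither type's value set contains the other's.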

Thanks for the pointer!  I don't have a good understanding of the backwards
compatibility of those conversions; I'd guess Mike, Segher and David would have
more insight.

BR,
Kewen