Re: [v4 PATCH 4/4] RISC-V: Add zhinx/zhinxmin testcases.

2022-10-30 Thread Andreas Schwab
On Oct 20 2022, jiawei wrote:

> diff --git a/gcc/testsuite/gcc.target/riscv/_Float16-zhinx-1.c 
> b/gcc/testsuite/gcc.target/riscv/_Float16-zhinx-1.c
> new file mode 100644
> index 00000000000..90172b57e05
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/_Float16-zhinx-1.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64i_zhinx -mabi=lp64 -O" } */
> +
> +_Float16 foo1 (_Float16 a, _Float16 b)
> +{
> +return b;
> +}
> +
> +/* { dg-final { scan-assembler-not "fmv.h" } } */
> +/* { dg-final { scan-assembler-times "mv" 1 } } */

This fails with -flto (mv is found twice).

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH] RISC-V: Support load/store in mov pattern for RVV modes.

2022-10-30 Thread Andreas Schwab
On Oct 24 2022, juzhe.zh...@rivai.ai wrote:

>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-10.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-11.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-12.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-13.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-5.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-6.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-7.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-8.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-9.c

They all fail if the ilp32d ABI is not available.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH] RISC-V: Add RVV vsetvl/vsetvlmax intrinsics and tests.

2022-10-30 Thread Andreas Schwab
On Oct 17 2022, juzhe.zh...@rivai.ai wrote:

> gcc/testsuite/ChangeLog:
>
>   * gcc.target/riscv/rvv/base/vsetvl-1.c: New test.

This fails if the ilp32d ABI is not available.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH Rust front-end v3 01/46] Use DW_ATE_UTF for the Rust 'char' type

2022-10-30 Thread Mark Wielaard
Hi,

On Wed, Oct 26, 2022 at 10:39:09AM +0200, Jakub Jelinek wrote:
> I must say I don't understand nor like this DW_LANG_Rust_old stuff at all.
> Other languages don't do similar dances.
> Look for D, or Go.  Neither of them has any non-standard lang code as
> fallback, they use the DWARF assigned DW_LANG_* code, and DW_LANG_C as
> fallback.  On most arches, DWARF 5 is the default anyway, or non-strict
> DWARF at least.  Where neither is enabled because of prehistoric or buggy
> DWARF consumers, it is unlikely they'd handle Rust sanely anyway.
> Just follow what Go does in the same function.

DW_LANG_Rust_old was used by old rustc compilers <= 2016 before DWARF5
assigned an official number. It might be recognized by some
debuggers. But I agree that these days it doesn't really make sense to
emit it. When producing strict DWARF it is also slightly odd to emit a
non-standard language code. So I agree that it makes sense to do what
Go does, always emit DW_LANG_Rust unless we emit strict DWARF for
versions before 5 (and then just fall back to DW_LANG_C).
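
The selection rule described above can be sketched in C as follows. This is
only an illustration, not code from dwarf2out.cc: the helper name
rust_lang_code is hypothetical, and the numeric values are the language codes
as I understand them from the DWARF standard (0x9000 being the old
non-standard value).

```c
#include <assert.h>
#include <stdbool.h>

/* Language codes; DW_LANG_Rust_old is the pre-DWARF5 non-standard value.  */
enum { DW_LANG_C = 0x0002, DW_LANG_Rust = 0x001c, DW_LANG_Rust_old = 0x9000 };

/* Hypothetical helper, not a function in dwarf2out.cc: returns the language
   code a Rust compile unit would get under the scheme described above.  */
static unsigned int
rust_lang_code (int dwarf_version, bool dwarf_strict)
{
  assert (dwarf_version >= 2);
  /* DWARF 5 knows DW_LANG_Rust; non-strict output may use it anyway.  */
  if (dwarf_version >= 5 || !dwarf_strict)
    return DW_LANG_Rust;
  /* Strict DWARF before version 5: fall back to a standard code.  */
  return DW_LANG_C;
}
```

Under this sketch DW_LANG_Rust_old is never returned, which is the point of
the change.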

The attached patch (against "upstream gccrs") does that. I kept the
oldlang.rs testcase just to see that the -gstrict-dwarf -gdwarf-3 case
does something sane.

The only "issue" is that is_rust () depends on the comp_unit_die
DW_AT_language being DW_LANG_Rust. But the only usage of is_rust
already depends on strict DWARF.

https://code.wildebeest.org/git/user/mjw/gccrs/commit/?h=no-Rust-old
if someone wants to push that, to merge for a v4.

Thanks,

Mark

From cdcfe27cfba23402f91200c64c1ef8e0bf3528a0 Mon Sep 17 00:00:00 2001
From: Mark Wielaard 
Date: Sun, 30 Oct 2022 16:03:16 +0100
Subject: [PATCH] dwarf2out.cc: Don't emit DW_LANG_Rust_old

DW_LANG_Rust_old is a non-standard DWARF language code used by old
rustc compilers before DWARF5 (released in 2017). Just always emit
DW_LANG_Rust unless producing strict DWARF for versions before 5.
And in that old strict DWARF case just emit DW_LANG_C instead of a
non-standard language code.
---
 gcc/dwarf2out.cc                    | 14 +++++---------
 gcc/testsuite/rust/debug/oldlang.rs |  4 ++--
 2 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc
index 7b9d5ae33fc..87c0d103a27 100644
--- a/gcc/dwarf2out.cc
+++ b/gcc/dwarf2out.cc
@@ -5600,14 +5600,15 @@ is_fortran (const_tree decl)
   return is_fortran ();
 }
 
-/* Return TRUE if the language is Rust.  */
+/* Return TRUE if the language is Rust.
+   Note, returns FALSE for dwarf_version < 5 && dwarf_strict. */
 
 static inline bool
 is_rust ()
 {
   unsigned int lang = get_AT_unsigned (comp_unit_die (), DW_AT_language);
 
-  return lang == DW_LANG_Rust || lang == DW_LANG_Rust_old;
+  return lang == DW_LANG_Rust;
 }
 
 /* Return TRUE if the language is Ada.  */
@@ -25216,13 +25217,6 @@ gen_compile_unit_die (const char *filename)
 }
   else if (strcmp (language_string, "GNU F77") == 0)
 language = DW_LANG_Fortran77;
-  else if (strcmp (language_string, "GNU Rust") == 0)
-{
-  if (dwarf_version >= 5 || !dwarf_strict)
-	language = DW_LANG_Rust;
-  else
-	language = DW_LANG_Rust_old;
-}
   else if (dwarf_version >= 3 || !dwarf_strict)
 {
   if (strcmp (language_string, "GNU Ada") == 0)
@@ -25248,6 +25242,8 @@ gen_compile_unit_die (const char *filename)
 	{
 	  if (strcmp (language_string, "GNU Go") == 0)
 	language = DW_LANG_Go;
+	  else if (strcmp (language_string, "GNU Rust") == 0)
+	language = DW_LANG_Rust;
 	}
 }
   /* Use a degraded Fortran setting in strict DWARF2 so is_fortran works.  */
diff --git a/gcc/testsuite/rust/debug/oldlang.rs b/gcc/testsuite/rust/debug/oldlang.rs
index ddacf0e4392..648d6b78f06 100644
--- a/gcc/testsuite/rust/debug/oldlang.rs
+++ b/gcc/testsuite/rust/debug/oldlang.rs
@@ -1,6 +1,6 @@
 fn main () {
 // { dg-do compile }
 // { dg-options "-gstrict-dwarf -gdwarf-3 -dA" }
-// DW_LANG_Rust_old is 0x9000
-// { dg-final { scan-assembler "0x9000\[ \t]\[^\n\r]* DW_AT_language" } } */
+// Strict DWARF < 5 uses DW_LANG_C = 0x0002
+// { dg-final { scan-assembler "0x2\[ \t]\[^\n\r]* DW_AT_language" } } */
 }
-- 
2.30.2



Re: [PATCH Rust front-end v3 01/46] Use DW_ATE_UTF for the Rust 'char' type

2022-10-30 Thread Jakub Jelinek via Gcc-patches
On Sun, Oct 30, 2022 at 04:22:34PM +0100, Mark Wielaard wrote:
> Hi,
> 
> On Wed, Oct 26, 2022 at 10:39:09AM +0200, Jakub Jelinek wrote:
> > I must say I don't understand nor like this DW_LANG_Rust_old stuff at all.
> > Other languages don't do similar dances.
> > Look for D, or Go.  Neither of them has any non-standard lang code as
> > fallback, they use the DWARF assigned DW_LANG_* code, and DW_LANG_C as
> > fallback.  On most arches, DWARF 5 is the default anyway, or non-strict
> > DWARF at least.  Where neither is enabled because of prehistoric or buggy
> > DWARF consumers, it is unlikely they'd handle Rust sanely anyway.
> > Just follow what Go does in the same function.
> 
> DW_LANG_Rust_old was used by old rustc compilers <= 2016 before DWARF5
> assigned an official number. It might be recognized by some
> debuggers. But I agree that these days it doesn't really make sense to
> emit it. When producing strict DWARF it is also slightly odd to emit a
> non-standard language code. So I agree that it makes sense to do what
> Go does, always emit DW_LANG_Rust unless we emit strict DWARF for
> versions before 5 (and then just fall back to DW_LANG_C).
> 
> The attached patch (against "upstream gccrs") does that. I kept the
> oldlang.rs testcase just to see that the -gstrict-dwarf -gdwarf-3 case
> does something sane.
> 
> The only "issue" is that is_rust () depends on the comp_unit_die
> DW_AT_language being DW_LANG_Rust. But the only usage of is_rust
> already depends on strict DWARF.
> 
> https://code.wildebeest.org/git/user/mjw/gccrs/commit/?h=no-Rust-old
> if someone wants to push that, to merge for a v4.

LGTM, thanks.

Jakub



Re: [PATCH] Fortran: ordering of hidden procedure arguments [PR107441]

2022-10-30 Thread Mikael Morin

On 28/10/2022 at 22:12, Harald Anlauf via Fortran wrote:
> Dear all,
>
> the passing of procedure arguments in Fortran sometimes requires
> ancillary parameters that are "hidden".  Examples are string length
> and the presence status of scalar variables with optional+value
> attribute.
>
> The gfortran ABI is actually documented:
>
> https://gcc.gnu.org/onlinedocs/gfortran/Argument-passing-conventions.html
>
> The reporter found that there was a discrepancy between the
> caller and the callee.  This is corrected by the attached patch.

Hello,

I think some discrepancy remains, as gfc_conv_procedure_call accumulates 
coarray stuff into the stringargs, while your change accumulates the 
associated parameter decls separately into hidden_arglist.  It's not 
completely clear to me whether it is really problematic (string length 
and coarray metadata are both integers anyway), but I suspect it is.


Another probable issue is that your change to create_function_arglist changes
arglist/hidden_arglist without also changing typelist/hidden_typelist
accordingly.  I think a change to gfc_get_function_type is also
necessary: as the function decl is changed, the decl type needs to be
changed as well.


I will see whether I can manage to exhibit testcases for these issues.



Re: [PATCH] Fortran: ordering of hidden procedure arguments [PR107441]

2022-10-30 Thread Mikael Morin

On 30/10/2022 at 20:23, Mikael Morin wrote:
> I think some discrepancy remains, as gfc_conv_procedure_call accumulates
> coarray stuff into the stringargs, while your change accumulates the
> associated parameter decls separately into hidden_arglist.  It's not
> completely clear to me whether it is really problematic (string length
> and coarray metadata are both integers anyway), but I suspect it is.
>
> I will see whether I can manage to exhibit testcases for these issues.


Here is a -fcoarray=lib example.  It can be placed in gfortran/coarray.

! { dg-do run }
!
! PR fortran/107441
! Check that with -fcoarray=lib, coarray metadata arguments are passed
! in the right order to procedures.
program p
  integer :: ci[*]
  ci = 17
  call s(1, ci, "abcd")
contains
  subroutine s(ra, ca, c)
    integer :: ra, ca[*]
    character(*) :: c
    ca[1] = 13
    if (ra /= 1) stop 1
    if (this_image() == 1) then
      if (ca /= 13) stop 2
    else
      if (ca /= 17) stop 3
    end if
    if (len(c) /= 4) stop 4
    if (c /= "abcd") stop 5
  end subroutine s
end program p



Re: [PATCH] Fortran: ordering of hidden procedure arguments [PR107441]

2022-10-30 Thread Mikael Morin

On 30/10/2022 at 20:23, Mikael Morin wrote:
> Another probable issue is that your change to create_function_arglist changes
> arglist/hidden_arglist without also changing typelist/hidden_typelist
> accordingly.  I think a change to gfc_get_function_type is also
> necessary: as the function decl is changed, the decl type needs to be
> changed as well.
>
> I will see whether I can manage to exhibit testcases for these issues.


Here is a test for the type vs decl mismatch.

! { dg-do run }
!
! PR fortran/107441
! Check that procedure types and procedure decls match when the procedure
! has both character-typed and optional value args.

program p
  interface
    subroutine i(c, o)
      character(*) :: c
      integer, optional, value :: o
    end subroutine i
  end interface
  procedure(i), pointer :: pp
  pp => s   ! associate the pointer before calling through it
  call pp("abcd")
contains
  subroutine s(c, o)
    character(*) :: c
    integer, optional, value :: o
    if (present(o)) stop 1
    if (len(c) /= 4) stop 2
    if (c /= "abcd") stop 3
  end subroutine s
end program p



RE: [PATCH 4/6] Support Intel AVX-NE-CONVERT

2022-10-30 Thread Liu, Hongtao via Gcc-patches


> -----Original Message-----
> From: Kong, Lingling 
> Sent: Friday, October 28, 2022 4:57 PM
> To: Hongtao Liu 
> Cc: Liu, Hongtao ; gcc-patches@gcc.gnu.org; Jiang,
> Haochen 
> Subject: RE: [PATCH 4/6] Support Intel AVX-NE-CONVERT
> 
> Hi,
> 
> Because we switched the avx512bf16 intrinsics to the new type __bf16, we
> can now use m128/256bh for the vector bf16 type instead of m128/256bf16.
> The builtins for avx512bf16/avxneconvert are also unified.
Ok.
> 
> Thanks,
> Lingling
> 
> > -----Original Message-----
> > From: Hongtao Liu 
> > Sent: Tuesday, October 25, 2022 1:23 PM
> > To: Kong, Lingling 
> > Cc: Liu, Hongtao ; gcc-patches@gcc.gnu.org;
> > Jiang, Haochen 
> > Subject: Re: [PATCH 4/6] Support Intel AVX-NE-CONVERT
> >
> > On Mon, Oct 24, 2022 at 2:20 PM Kong, Lingling
> > 
> > wrote:
> > >
> > > > From: Gcc-patches
> > > > 
> > > > On Behalf Of Hongtao Liu via Gcc-patches
> > > > Sent: Monday, October 17, 2022 1:47 PM
> > > > To: Jiang, Haochen 
> > > > Cc: Liu, Hongtao ; gcc-patches@gcc.gnu.org
> > > > Subject: Re: [PATCH 4/6] Support Intel AVX-NE-CONVERT
> > > >
> > > > On Fri, Oct 14, 2022 at 3:58 PM Haochen Jiang via Gcc-patches
> > > >  wrote:
> > > > >
> > > > > From: Kong Lingling 
> > > > > +(define_insn "vbcstne2ps_"
> > > > > +  [(set (match_operand:VF1_128_256 0 "register_operand" "=x")
> > > > > +(vec_duplicate:VF1_128_256
> > > > > + (unspec:SF
> > > > > +  [(match_operand:HI 1 "memory_operand" "m")]
> > > > > +  VBCSTNE)))]
> > > > > +  "TARGET_AVXNECONVERT"
> > > > > +  "vbcstne2ps\t{%1, %0|%0, %1}"
> > > > > +  [(set_attr "prefix" "vex")
> > > > > +  (set_attr "mode" "")])
> > > > Since Jakub has added support for bf16 software emulation, can we
> > > > rewrite it with general RTL IR, without unspec? For example:
> > > > (float_extend:SF (match_operand:BF "memory_operand" "m"))
> > > > > +
> > > > > +(define_int_iterator VCVTNEBF16
> > > > > +  [UNSPEC_VCVTNEEBF16SF
> > > > > +   UNSPEC_VCVTNEOBF16SF])
> > > > > +
> > > > > +(define_int_attr vcvtnebf16type
> > > > > +  [(UNSPEC_VCVTNEEBF16SF "ebf16")
> > > > > +   (UNSPEC_VCVTNEOBF16SF "obf16")]) (define_insn
> > > > > +"vcvtne2ps_"
> > > > > +  [(set (match_operand:VF1_128_256 0 "register_operand" "=x")
> > > > > +(unspec:VF1_128_256
> > > > > +  [(match_operand: 1 "memory_operand" "m")]
> > > > > + VCVTNEBF16))]
> > > > > +  "TARGET_AVXNECONVERT"
> > > > > +  "vcvtne2ps\t{%1, %0|%0, %1}"
> > > > > +  [(set_attr "prefix" "vex")
> > > > > +   (set_attr "mode" "")])
> > > > Similar for this one and all those patterns below.
> > >
> > > That's great! Thanks for the review!
> > > It is now rewritten without unspec and uses float_extend for the new define_insn.
> > Ok.
> > >
> > > Thanks
> > > Lingling
> > >
> > >
> >
> >
> > --
> > BR,
> > Hongtao


[PATCH V2] [x86] Fix incorrect digit constraint

2022-10-30 Thread liuhongt via Gcc-patches
>You have a couple of other patterns where operand 1 is matched to
>produce vmovddup insn. These are *avx512f_unpcklpd512 and
>avx_unpcklpd256. You can also remove expander in both
>cases.

Yes, changed in V2 patch.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?


Matching constraints are used in these circumstances. More precisely,
the two operands that match must include one input-only operand and
one output-only operand. Moreover, the digit must be a smaller number
than the number of the operand that uses it in the constraint.

In pr107057, the 2 operands in the pattern are both input operands.
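
To make the rule above concrete, here is a minimal C sketch of a validity
check for a digit (matching) constraint. It is not GCC code: the type
operand_kind, the function matching_constraint_ok and the array example_kinds
are all hypothetical names, and in-out ("+") operands are deliberately left
out of the model.

```c
#include <assert.h>
#include <stdbool.h>

enum operand_kind { OP_INPUT, OP_OUTPUT };

/* Hypothetical model of a three-operand pattern: operand 0 is the output,
   operands 1 and 2 are inputs.  */
static const enum operand_kind example_kinds[3]
  = { OP_OUTPUT, OP_INPUT, OP_INPUT };

/* Operand USER carries a digit constraint naming operand MATCHED.
   Returns true if that pairing satisfies the rule quoted above.  */
static bool
matching_constraint_ok (const enum operand_kind *kinds, int n_operands,
			int user, int matched)
{
  assert (user >= 0 && user < n_operands);
  assert (matched >= 0 && matched < n_operands);
  /* The digit must be smaller than the number of the operand using it.  */
  if (matched >= user)
    return false;
  /* One of the pair must be an input, the other an output.  */
  return kinds[user] != kinds[matched];
}
```

With this model, operand 1 matching operand 0 is accepted, while operand 2
matching operand 1 (two inputs, the situation described for pr107057) is
rejected.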

gcc/ChangeLog:

PR target/107057
* config/i386/sse.md (*vec_interleave_highv2df): Remove
constraint 1.
(*vec_interleave_lowv2df): Ditto.
(vec_concatv2df): Ditto.
(*avx512f_unpcklpd512): Ditto and renamed to ..
(avx512f_unpcklpd512): .. this.
(avx512f_movddup512): Change to define_insn.
(avx_movddup256): Ditto.
(*avx_unpcklpd256): Remove constraint 1 and renamed
to ..
(avx_unpcklpd256): .. this.
* config/i386/i386.cc (ix86_vec_interleave_v2df_operator_ok):
Disallow MEM_P (op1) && MEM_P (op2).

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr107057.c: New test.
---
 gcc/config/i386/i386.cc  |   2 +-
 gcc/config/i386/sse.md   | 140 +--
 gcc/testsuite/gcc.target/i386/pr107057.c |  19 +++
 3 files changed, 77 insertions(+), 84 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr107057.c

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index aeea26ef4be..e3b7bea0d68 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -15652,7 +15652,7 @@ ix86_vec_interleave_v2df_operator_ok (rtx operands[3], bool high)
   if (MEM_P (operands[0]))
 return rtx_equal_p (operands[0], operands[1 + high]);
   if (MEM_P (operands[1]) && MEM_P (operands[2]))
-return TARGET_SSE3 && rtx_equal_p (operands[1], operands[2]);
+return false;
   return true;
 }
 
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index f4b5506703f..b7922521734 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -12170,107 +12170,88 @@ (define_expand "vec_interleave_highv2df"
 })
 
 (define_insn "*vec_interleave_highv2df"
-  [(set (match_operand:V2DF 0 "nonimmediate_operand" "=x,v,v,x,v,m")
+  [(set (match_operand:V2DF 0 "nonimmediate_operand" "=x,v,x,v,m")
(vec_select:V2DF
  (vec_concat:V4DF
-   (match_operand:V2DF 1 "nonimmediate_operand" " 0,v,o,o,o,v")
-   (match_operand:V2DF 2 "nonimmediate_operand" " x,v,1,0,v,0"))
+   (match_operand:V2DF 1 "nonimmediate_operand" " 0,v,o,o,v")
+   (match_operand:V2DF 2 "nonimmediate_operand" " x,v,0,v,0"))
  (parallel [(const_int 1)
 (const_int 3)])))]
   "TARGET_SSE2 && ix86_vec_interleave_v2df_operator_ok (operands, 1)"
   "@
unpckhpd\t{%2, %0|%0, %2}
vunpckhpd\t{%2, %1, %0|%0, %1, %2}
-   %vmovddup\t{%H1, %0|%0, %H1}
movlpd\t{%H1, %0|%0, %H1}
vmovlpd\t{%H1, %2, %0|%0, %2, %H1}
%vmovhpd\t{%1, %0|%q0, %1}"
-  [(set_attr "isa" "noavx,avx,sse3,noavx,avx,*")
-   (set_attr "type" "sselog,sselog,sselog,ssemov,ssemov,ssemov")
+  [(set_attr "isa" "noavx,avx,noavx,avx,*")
+   (set_attr "type" "sselog,sselog,ssemov,ssemov,ssemov")
(set (attr "prefix_data16")
- (if_then_else (eq_attr "alternative" "3,5")
+ (if_then_else (eq_attr "alternative" "2,4")
   (const_string "1")
   (const_string "*")))
-   (set_attr "prefix" "orig,maybe_evex,maybe_vex,orig,maybe_evex,maybe_vex")
-   (set_attr "mode" "V2DF,V2DF,DF,V1DF,V1DF,V1DF")])
+   (set_attr "prefix" "orig,maybe_evex,orig,maybe_evex,maybe_vex")
+   (set_attr "mode" "V2DF,V2DF,V1DF,V1DF,V1DF")])
 
-(define_expand "avx512f_movddup512"
-  [(set (match_operand:V8DF 0 "register_operand")
+(define_insn "avx512f_movddup512"
+  [(set (match_operand:V8DF 0 "register_operand" "=v")
(vec_select:V8DF
  (vec_concat:V16DF
-   (match_operand:V8DF 1 "nonimmediate_operand")
+   (match_operand:V8DF 1 "memory_operand" "m")
(match_dup 1))
  (parallel [(const_int 0) (const_int 8)
 (const_int 2) (const_int 10)
 (const_int 4) (const_int 12)
 (const_int 6) (const_int 14)])))]
-  "TARGET_AVX512F")
-
-(define_expand "avx512f_unpcklpd512"
-  [(set (match_operand:V8DF 0 "register_operand")
-   (vec_select:V8DF
- (vec_concat:V16DF
-   (match_operand:V8DF 1 "register_operand")
-   (match_operand:V8DF 2 "nonimmediate_operand"))
- (parallel [(const_int 0) (const_int 8)
-(const_int 2) (const_int 10)
-(const_int 4) (const_int 12)
-(const_int 6) (const_int 14)])))]
-  "TARGET_AVX512F")
+  "TARGET_AVX5

[PATCH] Enable more optimization for 32-bit/64-bit shrd/shld with imm shift count.

2022-10-30 Thread liuhongt via Gcc-patches
This patch doesn't handle a variable count, since that requires 5 insns to
be combined to get the wanted pattern, but the current pass_combine only
supports at most 4.
This patch doesn't handle 16-bit shrd/shld either.

Ideally, we could avoid the redundancy of
*x86_64_shld_shrd_1_nozext/*x86_shld_shrd_1_nozext
if the middle end could recognize that they're just variants of
*x86_64_shrd_shld_1_nozext/*x86_shrd_shld_1_nozext with ashift/lshiftrt swapped
in the ior, which is commutative. But currently it doesn't, so I add both of
them in the patch.
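
For illustration, the kind of source-level idiom these patterns target could
look like the C sketch below: two different values shifted by constant counts
that add up to the word size, combined with an ior. The function names are
made up here and this is not the content of pr55583.c; whether a given
compiler turns these exact functions into shld/shrd has not been checked.

```c
#include <assert.h>
#include <stdint.h>

/* Double-word shift left: the top N bits of LO are shifted into the
   bottom of HI.  Matches the shape (ior (ashift a n) (lshiftrt b 64-n)).  */
static inline uint64_t
shld64 (uint64_t hi, uint64_t lo, unsigned n)
{
  assert (n > 0 && n < 64);	/* n == 0 would make lo >> 64 undefined */
  return (hi << n) | (lo >> (64 - n));
}

/* Double-word shift right: the low N bits of HI are shifted into the
   top of LO.  Same ior, with the two shift directions swapped.  */
static inline uint64_t
shrd64 (uint64_t lo, uint64_t hi, unsigned n)
{
  assert (n > 0 && n < 64);
  return (lo >> n) | (hi << (64 - n));
}

/* Immediate shift counts, the only case the patch description covers.  */
uint64_t shld64_by8 (uint64_t hi, uint64_t lo) { return shld64 (hi, lo, 8); }
uint64_t shrd64_by8 (uint64_t lo, uint64_t hi) { return shrd64 (lo, hi, 8); }
```

The two counts satisfy 56 == 64 - 8, which mirrors the
INTVAL (operands[3]) == 64 - INTVAL (operands[2]) condition in the 64-bit
splitters.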

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?


gcc/ChangeLog:

PR target/55583
* config/i386/i386.md (*x86_64_shld_1): Rename to ..
(x86_64_shld_1): .. this.
(*x86_shld_1): Rename to ..
(x86_shld_1): .. this.
(*x86_64_shrd_1): Rename to ..
(x86_64_shrd_1): .. this.
(*x86_shrd_1): Rename to ..
(x86_shrd_1): .. this.
(*x86_64_shld_shrd_1_nozext): New pre_reload splitter.
(*x86_shld_shrd_1_nozext): Ditto.
(*x86_64_shrd_shld_1_nozext): Ditto.
(*x86_shrd_shld_1_nozext): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr55583.c: New test.
---
 gcc/config/i386/i386.md | 150 +++-
 gcc/testsuite/gcc.target/i386/pr55583.c |  27 +
 2 files changed, 173 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr55583.c

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index baf1f1f8fa2..a3ac319f0d7 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -12470,7 +12470,7 @@ (define_insn "x86_64_shld"
(set_attr "amdfam10_decode" "vector")
(set_attr "bdver1_decode" "vector")])
 
-(define_insn "*x86_64_shld_1"
+(define_insn "x86_64_shld_1"
   [(set (match_operand:DI 0 "nonimmediate_operand" "+r*m")
 (ior:DI (ashift:DI (match_dup 0)
   (match_operand:QI 2 "const_0_to_63_operand"))
@@ -12491,6 +12491,42 @@ (define_insn "*x86_64_shld_1"
(set_attr "amdfam10_decode" "vector")
(set_attr "bdver1_decode" "vector")])
 
+(define_insn_and_split "*x86_64_shld_shrd_1_nozext"
+  [(set (match_operand:DI 0 "nonimmediate_operand")
+   (ior:DI (ashift:DI (match_operand:DI 4 "nonimmediate_operand")
+(match_operand:QI 2 "const_0_to_63_operand"))
+   (lshiftrt:DI
+ (match_operand:DI 1 "nonimmediate_operand")
+ (match_operand:QI 3 "const_0_to_63_operand"))))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_64BIT
+   && INTVAL (operands[3]) == 64 - INTVAL (operands[2])
+   && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  if (rtx_equal_p (operands[4], operands[0]))
+{
+  operands[1] = force_reg (DImode, operands[1]);
+  emit_insn (gen_x86_64_shld_1 (operands[0], operands[1], operands[2], operands[3]));
+}
+  else if (rtx_equal_p (operands[1], operands[0]))
+{
+  operands[4] = force_reg (DImode, operands[4]);
+  emit_insn (gen_x86_64_shrd_1 (operands[0], operands[4], operands[3], operands[2]));
+}
+  else
+   {
+ operands[1] = force_reg (DImode, operands[1]);
+ rtx tmp = gen_reg_rtx (DImode);
+ emit_move_insn (tmp, operands[4]);
+ emit_insn (gen_x86_64_shld_1 (tmp, operands[1], operands[2], operands[3]));
+ emit_move_insn (operands[0], tmp);
+   }
+   DONE;
+})
+
 (define_insn_and_split "*x86_64_shld_2"
   [(set (match_operand:DI 0 "nonimmediate_operand")
(ior:DI (ashift:DI (match_dup 0)
@@ -12534,7 +12570,7 @@ (define_insn "x86_shld"
(set_attr "amdfam10_decode" "vector")
(set_attr "bdver1_decode" "vector")])
 
-(define_insn "*x86_shld_1"
+(define_insn "x86_shld_1"
   [(set (match_operand:SI 0 "nonimmediate_operand" "+r*m")
 (ior:SI (ashift:SI (match_dup 0)
   (match_operand:QI 2 "const_0_to_31_operand"))
@@ -12555,6 +12591,41 @@ (define_insn "*x86_shld_1"
(set_attr "amdfam10_decode" "vector")
(set_attr "bdver1_decode" "vector")])
 
+(define_insn_and_split "*x86_shld_shrd_1_nozext"
+  [(set (match_operand:SI 0 "nonimmediate_operand")
+   (ior:SI (ashift:SI (match_operand:SI 4 "nonimmediate_operand")
+(match_operand:QI 2 "const_0_to_31_operand"))
+  (lshiftrt:SI
+  (match_operand:SI 1 "nonimmediate_operand")
+  (match_operand:QI 3 "const_0_to_31_operand"))))
+   (clobber (reg:CC FLAGS_REG))]
+  "INTVAL (operands[3]) == 32 - INTVAL (operands[2])
+   && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  if (rtx_equal_p (operands[4], operands[0]))
+{
+  operands[1] = force_reg (SImode, operands[1]);
+  emit_insn (gen_x86_shld_1 (operands[0], operands[1], operands[2], operands[3]));
+}
+  else if (rtx_equal_p (operands[1], operands[0]))
+{
+  operands[4] = force_reg (SImode, operands[4]);
+  emit_insn (gen_x86_shrd_1 (operands[0], oper

[PATCH] RISC-V: Fix RVV testcases.

2022-10-30 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/abi-2.c: Change ilp32d to ilp32.
* gcc.target/riscv/rvv/base/abi-3.c: Ditto.
* gcc.target/riscv/rvv/base/abi-4.c: Ditto.
* gcc.target/riscv/rvv/base/abi-5.c: Ditto.
* gcc.target/riscv/rvv/base/abi-6.c: Ditto.
* gcc.target/riscv/rvv/base/abi-7.c: Ditto.
* gcc.target/riscv/rvv/base/mov-1.c: Ditto.
* gcc.target/riscv/rvv/base/mov-10.c: Ditto.
* gcc.target/riscv/rvv/base/mov-11.c: Ditto.
* gcc.target/riscv/rvv/base/mov-12.c: Ditto.
* gcc.target/riscv/rvv/base/mov-13.c: Ditto.
* gcc.target/riscv/rvv/base/mov-2.c: Ditto.
* gcc.target/riscv/rvv/base/mov-3.c: Ditto.
* gcc.target/riscv/rvv/base/mov-4.c: Ditto.
* gcc.target/riscv/rvv/base/mov-5.c: Ditto.
* gcc.target/riscv/rvv/base/mov-6.c: Ditto.
* gcc.target/riscv/rvv/base/mov-7.c: Ditto.
* gcc.target/riscv/rvv/base/mov-8.c: Ditto.
* gcc.target/riscv/rvv/base/mov-9.c: Ditto.
* gcc.target/riscv/rvv/base/pragma-1.c: Ditto.
* gcc.target/riscv/rvv/base/user-1.c: Ditto.
* gcc.target/riscv/rvv/base/user-2.c: Ditto.
* gcc.target/riscv/rvv/base/user-3.c: Ditto.
* gcc.target/riscv/rvv/base/user-4.c: Ditto.
* gcc.target/riscv/rvv/base/user-5.c: Ditto.
* gcc.target/riscv/rvv/base/user-6.c: Ditto.
* gcc.target/riscv/rvv/base/vsetvl-1.c: Ditto.

---
 gcc/testsuite/gcc.target/riscv/rvv/base/abi-2.c    | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/abi-3.c    | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/abi-4.c    | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/abi-5.c    | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/abi-6.c    | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/abi-7.c    | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/mov-1.c    | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/mov-10.c   | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/mov-11.c   | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/mov-12.c   | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/mov-13.c   | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/mov-2.c    | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/mov-3.c    | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/mov-4.c    | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/mov-5.c    | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/mov-6.c    | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/mov-7.c    | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/mov-8.c    | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/mov-9.c    | 8 ++++----
 gcc/testsuite/gcc.target/riscv/rvv/base/pragma-1.c | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/user-1.c   | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/user-2.c   | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/user-3.c   | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/user-4.c   | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/user-5.c   | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/user-6.c   | 2 +-
 gcc/testsuite/gcc.target/riscv/rvv/base/vsetvl-1.c | 2 +-
 27 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/abi-2.c b/gcc/testsuite/gcc.target/riscv/rvv/base/abi-2.c
index 92e61c255ac..9cd94c99308 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/abi-2.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/abi-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -march=rv32gc -mabi=ilp32d" } */
+/* { dg-options "-O3 -march=rv32gc -mabi=ilp32" } */
 
 void foo0 () {__rvv_bool64_t t;} /* { dg-error {unknown type name '__rvv_bool64_t'} } */
 void foo1 () {__rvv_bool32_t t;} /* { dg-error {unknown type name '__rvv_bool32_t'} } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/abi-3.c b/gcc/testsuite/gcc.target/riscv/rvv/base/abi-3.c
index b9adb3072f6..628a2753202 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/abi-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/abi-3.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -march=rv32gc_zve64x -mabi=ilp32d" } */
+/* { dg-options "-O3 -march=rv32gc_zve64x -mabi=ilp32" } */
 
 void foo0 () {__rvv_bool64_t t;}
 void foo1 () {__rvv_bool32_t t;}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/abi-4.c b/gcc/testsuite/gcc.target/riscv/rvv/base/abi-4.c
index 56a0ebed477..b4557aa6939 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/abi-4.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/abi-4.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O3 -march=rv32gc_zve64f -mabi=ilp32d" } */
+/* { dg-options "-O3 -march=rv32gc_zve64f -mabi=ilp32" } */
 
 void foo0 () {__rvv_bool64_t t;}
 void foo1 () {__rvv_bool32_t t;}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/abi-5.c b/gcc/testsuite/gcc.target/riscv/rvv/base/abi-5.c
index af716094491..a58167f29ab 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/base/abi-5.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/abi-5.c
@@ -1,5 +1,5 @@
 /* { d

Re: Re: [PATCH] RISC-V: Support load/store in mov pattern for RVV modes.

2022-10-30 Thread juzhe.zh...@rivai.ai
Hi, these RVV testcases don't necessarily need an ABI configuration, so
I fixed them in this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604642.html
Please verify it and merge it. Thanks.



juzhe.zh...@rivai.ai
 
From: Andreas Schwab
Date: 2022-10-30 19:00
To: juzhe.zhong
CC: gcc-patches; kito.cheng
Subject: Re: [PATCH] RISC-V: Support load/store in mov pattern for RVV modes.
On Oct 24 2022, juzhe.zh...@rivai.ai wrote:
 
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-10.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-11.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-12.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-13.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-5.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-6.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-7.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-8.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/mov-9.c
 
They all fail if the ilp32d ABI is not available.
 
-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."
 


Re: Re: [PATCH] RISC-V: Add RVV vsetvl/vsetvlmax intrinsics and tests.

2022-10-30 Thread juzhe.zh...@rivai.ai
Hi, these RVV testcases don't necessarily need an ABI configuration, so
I fixed them in this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604642.html
Please verify it and merge it. Thanks.



juzhe.zh...@rivai.ai
 
From: Andreas Schwab
Date: 2022-10-30 19:02
To: juzhe.zhong
CC: gcc-patches; kito.cheng
Subject: Re: [PATCH] RISC-V: Add RVV vsetvl/vsetvlmax intrinsics and tests.
On Oct 17 2022, juzhe.zh...@rivai.ai wrote:
 
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/vsetvl-1.c: New test.
 
This fails if the ilp32d ABI is not available.
 
-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."
 


RE: Ping^3 [PATCH V2] Add attribute hot judgement for INLINE_HINT_known_hot hint.

2022-10-30 Thread Cui, Lili via Gcc-patches
> 
> On 10/20/22 19:52, Cui, Lili via Gcc-patches wrote:
> > Hi Honza,
> >
> > Gentle ping
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601934.html
> >
> > gcc/ChangeLog
> >
> >* ipa-inline-analysis.cc (do_estimate_edge_time): Add function attribute
> >judgement for INLINE_HINT_known_hot hint.
> >
> > gcc/testsuite/ChangeLog:
> >
> >* gcc.dg/ipa/inlinehint-6.c: New test.
> > ---
> >   gcc/ipa-inline-analysis.cc  | 13 ---
> >   gcc/testsuite/gcc.dg/ipa/inlinehint-6.c | 47
> +
> >   2 files changed, 56 insertions(+), 4 deletions(-)
> >   create mode 100644 gcc/testsuite/gcc.dg/ipa/inlinehint-6.c
> >
> > diff --git a/gcc/ipa-inline-analysis.cc b/gcc/ipa-inline-analysis.cc
> > index 1ca685d1b0e..7bd29c36590 100644
> > --- a/gcc/ipa-inline-analysis.cc
> > +++ b/gcc/ipa-inline-analysis.cc
> > @@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
> >   #include "ipa-utils.h"
> >   #include "cfgexpand.h"
> >   #include "gimplify.h"
> > +#include "attribs.h"
> >
> >   /* Cached node/edge growths.  */
> >   fast_call_summary
> > *edge_growth_cache = NULL;
> > @@ -249,15 +250,19 @@ do_estimate_edge_time (struct cgraph_edge *edge, sreal *ret_nonspec_time)
> > hints = estimates.hints;
> >   }
> >
> > -  /* When we have profile feedback, we can quite safely identify hot
> > - edges and for those we disable size limits.  Don't do that when
> > - probability that caller will call the callee is low however, since it
> > +  /* When we have profile feedback or function attribute, we can quite safely
> > + identify hot edges and for those we disable size limits.  Don't do that
> > + when probability that caller will call the callee is low however, since it
> >may hurt optimization of the caller's hot path.  */
> > -  if (edge->count.ipa ().initialized_p () && edge->maybe_hot_p ()
> > +  if ((edge->count.ipa ().initialized_p () && edge->maybe_hot_p ()
> > && (edge->count.ipa () * 2
> >   > (edge->caller->inlined_to
> >  ? edge->caller->inlined_to->count.ipa ()
> >  : edge->caller->count.ipa (
> > +  || (lookup_attribute ("hot", DECL_ATTRIBUTES (edge->caller->decl))
> > + != NULL
> > +&& lookup_attribute ("hot", DECL_ATTRIBUTES (edge->callee->decl))
> > + != NULL))
> >   hints |= INLINE_HINT_known_hot;
> 
> Is the theory here that if the user has marked the caller and callee as hot,
> then we're going to assume an edge between them is hot too?  That's not
> necessarily true, it could be they're both hot, but via other call chains.
> But it's probably a reasonable heuristic in practice.
> 
Yes,  thanks Jeff.

Lili.
> 
> OK
> 
> 
> jeff
> 
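For illustration, a minimal sketch of the case the new check targets. The
function names are hypothetical and not taken from inlinehint-6.c; only the
"hot" attribute itself comes from the patch. With the patch, the edge from
sum_scaled to scale would be expected to get INLINE_HINT_known_hot even
without profile feedback, while the edge from scale_once would not, since
that caller is not marked hot.

```c
/* Callee marked hot by the user (hypothetical example).  */
__attribute__ ((hot)) static int
scale (int x)
{
  return x * 3 + 1;
}

/* Hot caller calling a hot callee: both ends of the edge carry the
   attribute, which is the condition the patch tests for.  */
__attribute__ ((hot)) int
sum_scaled (const int *v, int n)
{
  int s = 0;
  for (int i = 0; i < n; i++)
    s += scale (v[i]);
  return s;
}

/* Caller without the attribute: the attribute check alone does not
   apply to this edge.  */
int
scale_once (int x)
{
  return scale (x);
}
```

As Jeff notes, both functions being hot does not prove that this particular
edge is hot, so this is a heuristic rather than a guarantee.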



[RFC] propagation leap over memory copy for struct

2022-10-30 Thread Jiufu Guo via Gcc-patches
Hi,

We know that for struct variable assignment, a memory copy may be used.
And for memcpy, we may load and store as many bytes as possible at one time.
But this may not be the best choice here:
1. Before/after the struct variable assignment, the variable may be operated on.
And it is hard for some optimizations to leap over the memcpy.  Then some struct
operations may be sub-optimal, like the issue in PR65421.
2. The size of the struct is mostly constant, so the memcpy would be expanded.  Using
a small size to load/store and executing in parallel may not be slower than using
a large size to load/store (though more registers may be used for smaller sizes).


In PR65421, for the source code below:
t.c
#define FN 4
typedef struct { double a[FN]; } A;

A foo (const A *a) { return *a; }
A bar (const A a) { return a; }
///

If FN<=2, the size of "A" fits into TImode, and this code can be optimized
(by subreg/cse/fwprop/cprop) as:
---
foo:
.LFB0:
.cfi_startproc
blr

bar:
.LFB1:
.cfi_startproc
lfd 2,8(3)
lfd 1,0(3)
blr

If the size of "A" is larger than any INT mode size, RTL insns would be 
generated as:
   13: r125:V2DI=[r112:DI+0x20]
   14: r126:V2DI=[r112:DI+0x30]
   15: [r112:DI]=r125:V2DI
   16: [r112:DI+0x10]=r126:V2DI  /// memcpy for assignment: D.3338 = arg;
   17: r127:DF=[r112:DI]
   18: r128:DF=[r112:DI+0x8]
   19: r129:DF=[r112:DI+0x10]
   20: r130:DF=[r112:DI+0x18]


I'm thinking about ways to improve this.
Method1: One way may be to change the memory copy according to the type
of the struct if the size of the struct is not too big, and generate insns
like the below:
   13: r125:DF=[r112:DI+0x20]
   15: r126:DF=[r112:DI+0x28]
   17: r127:DF=[r112:DI+0x30]
   19: r128:DF=[r112:DI+0x38]
   14: [r112:DI]=r125:DF
   16: [r112:DI+0x8]=r126:DF
   18: [r112:DI+0x10]=r127:DF
   20: [r112:DI+0x18]=r128:DF
   21: r129:DF=[r112:DI]
   22: r130:DF=[r112:DI+0x8]
   23: r131:DF=[r112:DI+0x10]
   24: r132:DF=[r112:DI+0x18]

Then passes (cse, prop, dse...) could help to optimize the code.
Concerns with this method: we may not want to do this if the number of
fields is too large.  And the types/modes of each load/store may
depend on the platform and may not be the same as the types of the fields of
the struct.  For example:
for "struct {double a[3]; long long l;}", on ppc64le, DImode may be
better for assignments on parameters.
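
As a source-level analogy for Method1 (a sketch only, not how expand would
actually implement it): copy_block and copy_fields are hypothetical helper
names.  The first mirrors the block copy that a struct assignment may expand
to, the second mirrors the per-field DF loads/stores shown above.  Both leave
the same bytes in the destination.

```c
#include <string.h>

#define FN 4
typedef struct { double a[FN]; } A;

/* Block copy: the wide, type-agnostic form of "dst = src".  */
static void
copy_block (A *dst, const A *src)
{
  memcpy (dst, src, sizeof (A));
}

/* Field-wise copy: one scalar load/store per element, using the
   element type of the struct.  */
static void
copy_fields (A *dst, const A *src)
{
  for (int i = 0; i < FN; i++)
    dst->a[i] = src->a[i];
}
```

The field-wise form is what would let later passes match each following
scalar load against the store that produced it.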


Method2: One way may be to enhance CSE so that it is able to treat one large
memory slot as two (or more) combined slots:
   13: r125:V2DI#0=[r112:DI+0x20]
   13': r125:V2DI#8=[r112:DI+0x28]
   15: [r112:DI]#0=r125:V2DI#0
   15': [r112:DI]#8=r125:V2DI#8

This may seem more like a hack in CSE.
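
To illustrate the equivalence Method2 relies on (again only a C-level sketch
with hypothetical names, assuming an 8-byte double): a value read at offset 8
of a slot written by one wide copy is the same as the value read directly
from the source at offset 8.

```c
#include <string.h>

typedef struct { double lo; double hi; } Pair;

/* One wide copy into a temporary slot, then a narrow read at the
   offset of the second double.  */
double
read_hi_via_wide_copy (const Pair *src)
{
  unsigned char slot[sizeof (Pair)];
  double hi;
  memcpy (slot, src, sizeof (Pair));
  memcpy (&hi, slot + sizeof (double), sizeof (double));
  return hi;
}

/* The same value read directly, which is what forwarding through
   the wide copy would reduce the function above to.  */
double
read_hi_direct (const Pair *src)
{
  return src->hi;
}
```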


Method3: For some record types, use "PARALLEL:BLK" instead of "MEM:BLK".
To do this, "moving" between "PARALLEL<->PARALLEL" and "PARALLEL<->MEM"
may need to be enhanced.  This method may require more effort to make
it work for corner/unknown cases.

I'm wondering which would be more flexible for handling this issue?
Thanks for any comments and suggestions!

BR,
Jeff(Jiufu)