[PATCH] LoongArch: Fix the missing include file when using gcc plugins.

2023-07-11 Thread Guo Jie
From: Sun Haiyong 

gcc/ChangeLog:

* config.gcc: Add some include files to tm_file.

---
 gcc/config.gcc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 51ca5311fa4..b901aa8e5dc 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -2478,7 +2478,7 @@ riscv*-*-freebsd*)
 
 loongarch*-*-linux*)
tm_file="elfos.h gnu-user.h linux.h linux-android.h glibc-stdint.h ${tm_file}"
-   tm_file="${tm_file} loongarch/gnu-user.h loongarch/linux.h"
+   tm_file="${tm_file} loongarch/gnu-user.h loongarch/linux.h loongarch/loongarch-def.h loongarch/loongarch-tune.h loongarch/loongarch-driver.h"
extra_options="${extra_options} linux-android.opt"
tmake_file="${tmake_file} loongarch/t-linux"
gnu_ld=yes
-- 
2.20.1



[PATCH] LoongArch: Fix plugin header missing install.

2023-08-15 Thread Guo Jie
gcc/ChangeLog:

* config/loongarch/t-loongarch: Add loongarch-driver.h into
TM_H. Add loongarch-def.h and loongarch-tune.h into
OPTIONS_H_EXTRA.

Co-authored-by: Lulu Cheng 
---
 gcc/config/loongarch/t-loongarch | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/config/loongarch/t-loongarch b/gcc/config/loongarch/t-loongarch
index 6d6e3435d59..e73f4f437ef 100644
--- a/gcc/config/loongarch/t-loongarch
+++ b/gcc/config/loongarch/t-loongarch
@@ -16,6 +16,10 @@
 # along with GCC; see the file COPYING3.  If not see
 # .
 
+TM_H += $(srcdir)/config/loongarch/loongarch-driver.h
+OPTIONS_H_EXTRA += $(srcdir)/config/loongarch/loongarch-def.h \
+  $(srcdir)/config/loongarch/loongarch-tune.h
+
 # Canonical target triplet from config.gcc
 LA_MULTIARCH_TRIPLET = $(patsubst LA_MULTIARCH_TRIPLET=%,%,$\
 $(filter LA_MULTIARCH_TRIPLET=%,$(tm_defines)))
-- 
2.20.1



[PATCH] LoongArch: Fix inconsistent description in *sge_

2024-03-03 Thread Guo Jie
The constraint of op[1] is inconsistent with the output template.

gcc/ChangeLog:

* config/loongarch/loongarch.md
(define_insn "*sge_"): Fix inconsistency
error.

---
 gcc/config/loongarch/loongarch.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/loongarch/loongarch.md 
b/gcc/config/loongarch/loongarch.md
index f3b5c641fce..2d25374bdc9 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -3357,10 +3357,10 @@ (define_insn "*sgt_"
 
 (define_insn "*sge_"
   [(set (match_operand:GPR 0 "register_operand" "=r")
-   (any_ge:GPR (match_operand:X 1 "register_operand" "r")
+   (any_ge:GPR (match_operand:X 1 "arith_operand" "rI")
 (const_int 1)))]
   ""
-  "slti\t%0,%.,%1"
+  "slt%i1\t%0,%.,%1"
   [(set_attr "type" "slt")
(set_attr "mode" "")])
 
-- 
2.20.1



Re: [PATCH] LoongArch: Fix inconsistent description in *sge_

2024-03-04 Thread Guo Jie

Thanks for the feedback.

Comparing a const_imm12_operand with (const_int 1) is indeed always
constant-folded, before any tree-based optimization runs.


I will fix it in patch v2.
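
For context, the register-only form can be sanity-checked in plain C. The sketch below is only an illustration: the helper names `slt`, `sltu`, and `check_sge_identity` are made up here, and it assumes that `%.` in the output template stands for the hard-wired zero register. It checks that `x >= 1` is the same as `0 < x` for both signed and unsigned operands, which is what a plain `slt`/`sltu` against the zero register computes.

```c
#include <assert.h>
#include <stdint.h>

/* Model of slt rd, rj, rk: rd = (rj < rk) on signed 64-bit values.  */
static int slt(int64_t rj, int64_t rk) { return rj < rk; }

/* Model of sltu rd, rj, rk: the same comparison on unsigned values.  */
static int sltu(uint64_t rj, uint64_t rk) { return rj < rk; }

/* Check "x >= 1" against "0 < x" for a few boundary values.  */
static void check_sge_identity(void)
{
  static const int64_t samples[] = { INT64_MIN, -2, -1, 0, 1, 2, INT64_MAX };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      int64_t s = samples[i];
      uint64_t u = (uint64_t) s;  /* same bits, viewed as unsigned */
      assert((s >= 1) == slt(0, s));
      assert((u >= 1) == sltu(0, u));
    }
}
```

Calling `check_sge_identity()` exercises both the signed and the unsigned variant; no immediate form is needed because operand 1 is always a register once constants have been folded away.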


On 2024/3/4 at 5:18 PM, Xi Ruoyao wrote:

On Mon, 2024-03-04 at 11:03 +0800, Guo Jie wrote:

The constraint of op[1] is inconsistent with the output template.

gcc/ChangeLog:

* config/loongarch/loongarch.md
(define_insn "*sge_"): Fix inconsistency
error.

---
  gcc/config/loongarch/loongarch.md | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/loongarch/loongarch.md
b/gcc/config/loongarch/loongarch.md
index f3b5c641fce..2d25374bdc9 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -3357,10 +3357,10 @@ (define_insn "*sgt_"
  
  (define_insn "*sge_"

    [(set (match_operand:GPR 0 "register_operand" "=r")
-   (any_ge:GPR (match_operand:X 1 "register_operand" "r")
+   (any_ge:GPR (match_operand:X 1 "arith_operand" "rI")
     (const_int 1)))]

No, arith_operand is just register_operand or const_imm12_operand, but
comparing a const_imm12_operand with (const_int 1) should be folded into
a constant (even at -O0, AFAIK).  So allowing const_imm12_operand here
brings no benefit.


    ""
-  "slti\t%0,%.,%1"
+  "slt%i1\t%0,%.,%1"
    [(set_attr "type" "slt")
     (set_attr "mode" "")])
  


[PATCH v2] LoongArch: Fix inconsistent description in *sge_

2024-03-05 Thread Guo Jie
The constraint of op[1] is inconsistent with the output template.

gcc/ChangeLog:

* config/loongarch/loongarch.md
(define_insn "*sge_"): Fix inconsistency
error.

---
Update in v2:
Remove useless support for op[1] being a const_imm12_operand.

---
 gcc/config/loongarch/loongarch.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/loongarch/loongarch.md 
b/gcc/config/loongarch/loongarch.md
index f3b5c641fce..e35a001e0ed 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -3360,7 +3360,7 @@ (define_insn "*sge_"
(any_ge:GPR (match_operand:X 1 "register_operand" "r")
 (const_int 1)))]
   ""
-  "slti\t%0,%.,%1"
+  "slt\t%0,%.,%1"
   [(set_attr "type" "slt")
(set_attr "mode" "")])
 
-- 
2.20.1



[PATCH] LoongArch: Support loading floating-point zero into MEM[base + index].

2023-09-01 Thread Guo Jie
gcc/ChangeLog:

* config/loongarch/loongarch.md: Support 'G' -> 'k' in
movsf_hardfloat and movdf_hardfloat.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/const-double-zero-stx.c: New test.

---
 gcc/config/loongarch/loongarch.md  | 12 ++--
 .../loongarch/const-double-zero-stx.c  | 18 ++
 2 files changed, 24 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/const-double-zero-stx.c

diff --git a/gcc/config/loongarch/loongarch.md 
b/gcc/config/loongarch/loongarch.md
index b37e070660f..6f47c23a79c 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -1915,13 +1915,13 @@ (define_expand "movsf"
 })
 
 (define_insn "*movsf_hardfloat"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=f,f,f,m,f,k,m,*f,*r,*r,*r,*m")
-   (match_operand:SF 1 "move_operand" "f,G,m,f,k,f,G,*r,*f,*G*r,*m,*r"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=f,f,f,m,f,k,m,k,*f,*r,*r,*r,*m")
+   (match_operand:SF 1 "move_operand" "f,G,m,f,k,f,G,G,*r,*f,*G*r,*m,*r"))]
   "TARGET_HARD_FLOAT
&& (register_operand (operands[0], SFmode)
|| reg_or_0_operand (operands[1], SFmode))"
   { return loongarch_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "fmove,mgtf,fpload,fpstore,fpload,fpstore,store,mgtf,mftg,move,load,store")
+  [(set_attr "move_type" "fmove,mgtf,fpload,fpstore,fpload,fpstore,store,store,mgtf,mftg,move,load,store")
(set_attr "mode" "SF")])
 
 (define_insn "*movsf_softfloat"
@@ -1946,13 +1946,13 @@ (define_expand "movdf"
 })
 
 (define_insn "*movdf_hardfloat"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,m,f,k,m,*f,*r,*r,*r,*m")
-   (match_operand:DF 1 "move_operand" "f,G,m,f,k,f,G,*r,*f,*r*G,*m,*r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,m,f,k,m,k,*f,*r,*r,*r,*m")
+   (match_operand:DF 1 "move_operand" "f,G,m,f,k,f,G,G,*r,*f,*r*G,*m,*r"))]
   "TARGET_DOUBLE_FLOAT
&& (register_operand (operands[0], DFmode)
|| reg_or_0_operand (operands[1], DFmode))"
   { return loongarch_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "fmove,mgtf,fpload,fpstore,fpload,fpstore,store,mgtf,mftg,move,load,store")
+  [(set_attr "move_type" "fmove,mgtf,fpload,fpstore,fpload,fpstore,store,store,mgtf,mftg,move,load,store")
(set_attr "mode" "DF")])
 
 (define_insn "*movdf_softfloat"
diff --git a/gcc/testsuite/gcc.target/loongarch/const-double-zero-stx.c 
b/gcc/testsuite/gcc.target/loongarch/const-double-zero-stx.c
new file mode 100644
index 000..8fb04be8ff5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/const-double-zero-stx.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-times {stx\..\t\$r0} 2 } } */
+
+extern float arr_f[];
+extern double arr_d[];
+
+void
+test_f (int base, int index)
+{
+  arr_f[base + index] = 0.0;
+}
+
+void
+test_d (int base, int index)
+{
+  arr_d[base + index] = 0.0;
+}
-- 
2.20.1



[PATCH v2] LoongArch: Support storing floating-point zero into MEM[base + index].

2023-09-02 Thread Guo Jie
v2: Modify commit message.

gcc/ChangeLog:

* config/loongarch/loongarch.md: Support 'G' -> 'k' in
movsf_hardfloat and movdf_hardfloat.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/const-double-zero-stx.c: New test.

---
 gcc/config/loongarch/loongarch.md  | 12 ++--
 .../loongarch/const-double-zero-stx.c  | 18 ++
 2 files changed, 24 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/const-double-zero-stx.c

diff --git a/gcc/config/loongarch/loongarch.md 
b/gcc/config/loongarch/loongarch.md
index b37e070660f..6f47c23a79c 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -1915,13 +1915,13 @@ (define_expand "movsf"
 })
 
 (define_insn "*movsf_hardfloat"
-  [(set (match_operand:SF 0 "nonimmediate_operand" "=f,f,f,m,f,k,m,*f,*r,*r,*r,*m")
-   (match_operand:SF 1 "move_operand" "f,G,m,f,k,f,G,*r,*f,*G*r,*m,*r"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=f,f,f,m,f,k,m,k,*f,*r,*r,*r,*m")
+   (match_operand:SF 1 "move_operand" "f,G,m,f,k,f,G,G,*r,*f,*G*r,*m,*r"))]
   "TARGET_HARD_FLOAT
&& (register_operand (operands[0], SFmode)
|| reg_or_0_operand (operands[1], SFmode))"
   { return loongarch_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "fmove,mgtf,fpload,fpstore,fpload,fpstore,store,mgtf,mftg,move,load,store")
+  [(set_attr "move_type" "fmove,mgtf,fpload,fpstore,fpload,fpstore,store,store,mgtf,mftg,move,load,store")
(set_attr "mode" "SF")])
 
 (define_insn "*movsf_softfloat"
@@ -1946,13 +1946,13 @@ (define_expand "movdf"
 })
 
 (define_insn "*movdf_hardfloat"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,m,f,k,m,*f,*r,*r,*r,*m")
-   (match_operand:DF 1 "move_operand" "f,G,m,f,k,f,G,*r,*f,*r*G,*m,*r"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,f,f,m,f,k,m,k,*f,*r,*r,*r,*m")
+   (match_operand:DF 1 "move_operand" "f,G,m,f,k,f,G,G,*r,*f,*r*G,*m,*r"))]
   "TARGET_DOUBLE_FLOAT
&& (register_operand (operands[0], DFmode)
|| reg_or_0_operand (operands[1], DFmode))"
   { return loongarch_output_move (operands[0], operands[1]); }
-  [(set_attr "move_type" "fmove,mgtf,fpload,fpstore,fpload,fpstore,store,mgtf,mftg,move,load,store")
+  [(set_attr "move_type" "fmove,mgtf,fpload,fpstore,fpload,fpstore,store,store,mgtf,mftg,move,load,store")
(set_attr "mode" "DF")])
 
 (define_insn "*movdf_softfloat"
diff --git a/gcc/testsuite/gcc.target/loongarch/const-double-zero-stx.c 
b/gcc/testsuite/gcc.target/loongarch/const-double-zero-stx.c
new file mode 100644
index 000..8fb04be8ff5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/const-double-zero-stx.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-times {stx\..\t\$r0} 2 } } */
+
+extern float arr_f[];
+extern double arr_d[];
+
+void
+test_f (int base, int index)
+{
+  arr_f[base + index] = 0.0;
+}
+
+void
+test_d (int base, int index)
+{
+  arr_d[base + index] = 0.0;
+}
-- 
2.20.1



[PATCH] LoongArch: Enable -fsched-pressure by default at -O1 and higher.

2023-09-07 Thread Guo Jie
gcc/ChangeLog:

* common/config/loongarch/loongarch-common.cc:
(default_options loongarch_option_optimization_table):
Default to -fsched-pressure.

---
 gcc/common/config/loongarch/loongarch-common.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/common/config/loongarch/loongarch-common.cc 
b/gcc/common/config/loongarch/loongarch-common.cc
index c5ed37d27a6..b6901910b70 100644
--- a/gcc/common/config/loongarch/loongarch-common.cc
+++ b/gcc/common/config/loongarch/loongarch-common.cc
@@ -36,6 +36,7 @@ static const struct default_options 
loongarch_option_optimization_table[] =
   { OPT_LEVELS_ALL, OPT_fasynchronous_unwind_tables, NULL, 1 },
   { OPT_LEVELS_1_PLUS, OPT_fsection_anchors, NULL, 1 },
   { OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
+  { OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 },
   { OPT_LEVELS_NONE, 0, NULL, 0 }
 };
 
-- 
2.20.1



Re: [PATCH] LoongArch: Enable -fsched-pressure by default at -O1 and higher.

2023-09-08 Thread Guo Jie

Hi,

What I want to change is "gcc/common/config/loongarch/loongarch-common.cc",
and the patch was generated automatically by "git gcc-commit-mklog".

Is it necessary to remove "common/"?

Thanks for the review.


On 2023/9/8 at 4:06 PM, Xi Ruoyao wrote:

On Fri, 2023-09-08 at 10:00 +0800, Guo Jie wrote:

gcc/ChangeLog:

 * common/config/loongarch/loongarch-common.cc:

"common/" should be removed.  You can use "git gcc-verify" to figure out
this kind of error before sending a patch in the future.


 (default_options loongarch_option_optimization_table):
 Default to -fsched-pressure.

"Default to -fsched-pressure at -O1 or above."

Otherwise OK.


---
  gcc/common/config/loongarch/loongarch-common.cc | 1 +
  1 file changed, 1 insertion(+)

diff --git a/gcc/common/config/loongarch/loongarch-common.cc
b/gcc/common/config/loongarch/loongarch-common.cc
index c5ed37d27a6..b6901910b70 100644
--- a/gcc/common/config/loongarch/loongarch-common.cc
+++ b/gcc/common/config/loongarch/loongarch-common.cc
@@ -36,6 +36,7 @@ static const struct default_options
loongarch_option_optimization_table[] =
    { OPT_LEVELS_ALL, OPT_fasynchronous_unwind_tables, NULL, 1 },
    { OPT_LEVELS_1_PLUS, OPT_fsection_anchors, NULL, 1 },
    { OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
+  { OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 },
    { OPT_LEVELS_NONE, 0, NULL, 0 }
  };
  




[PATCH] LoongArch: Optimize the loading of immediate numbers with the same high and low 32-bit values

2023-11-17 Thread Guo Jie
For the following immediate load operation in 
gcc/testsuite/gcc.target/loongarch/imm-load1.c:

long long r = 0x0101010101010101;

Before this patch:

lu12i.w $r15,16842752>>12
ori $r15,$r15,257
lu32i.d $r15,0x10101>>32
lu52i.d $r15,$r15,0x100>>52

After this patch:

lu12i.w $r15,16842752>>12
ori $r15,$r15,257
bstrins.d   $r15,$r15,63,32
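
As a rough cross-check of the new three-instruction sequence, the sketch below models the instructions in plain C. The helper names are invented for this illustration, and the instruction semantics are written from the usual description of `lu12i.w`, `ori`, and `bstrins.d` rather than taken from the patch, so treat them as assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* lu12i.w rd, si20: rd = sign-extension of the 32-bit value (si20 << 12).  */
static uint64_t lu12i_w(uint32_t si20)
{
  uint64_t v = (uint64_t) (si20 & 0xfffffu) << 12;
  if (v & 0x80000000u)
    v |= 0xffffffff00000000ull;
  return v;
}

/* ori rd, rj, ui12: rd = rj | zero-extended 12-bit immediate.  */
static uint64_t ori(uint64_t rj, uint32_t ui12)
{
  return rj | (ui12 & 0xfffu);
}

/* bstrins.d rd, rj, msb, lsb: replace bits [msb:lsb] of rd with the low
   (msb - lsb + 1) bits of rj.  */
static uint64_t bstrins_d(uint64_t rd, uint64_t rj, unsigned msb, unsigned lsb)
{
  assert(lsb <= msb && msb < 64);
  unsigned width = msb - lsb + 1;
  uint64_t field = width == 64 ? ~0ull : ((1ull << width) - 1);
  uint64_t mask = field << lsb;
  return (rd & ~mask) | ((rj << lsb) & mask);
}

/* Replay the sequence shown under "After this patch".  */
static uint64_t load_mirrored_imm(void)
{
  uint64_t r15 = lu12i_w(16842752 >> 12);  /* 0x01010000 */
  r15 = ori(r15, 257);                     /* 0x01010101 */
  r15 = bstrins_d(r15, r15, 63, 32);       /* copy low half into high half */
  return r15;
}
```

Under these assumptions `load_mirrored_imm()` yields `0x0101010101010101`, the same constant that previously needed four instructions.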

gcc/ChangeLog:

* config/loongarch/loongarch.cc (enum loongarch_load_imm_method): Add 
new method.
(loongarch_build_integer): Add relevant implementations for new method.
(loongarch_move_integer): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/imm-load1.c: Change old check.
---
 gcc/config/loongarch/loongarch.cc | 22 ++-
 .../gcc.target/loongarch/imm-load1.c  |  3 ++-
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index d05743bec87..58c00344d09 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -142,12 +142,16 @@ struct loongarch_address_info
 
METHOD_LU52I:
  Load 52-63 bit of the immediate number.
+
+   METHOD_MIRROR:
+ Copy 0-31 bit of the immediate number to 32-63bit.
 */
 enum loongarch_load_imm_method
 {
   METHOD_NORMAL,
   METHOD_LU32I,
-  METHOD_LU52I
+  METHOD_LU52I,
+  METHOD_MIRROR
 };
 
 struct loongarch_integer_op
@@ -1556,11 +1560,23 @@ loongarch_build_integer (struct loongarch_integer_op 
*codes,
 
   int sign31 = (value & (HOST_WIDE_INT_1U << 31)) >> 31;
   int sign51 = (value & (HOST_WIDE_INT_1U << 51)) >> 51;
+
+  unsigned HOST_WIDE_INT hival = value >> 32;
+  unsigned HOST_WIDE_INT loval = value << 32 >> 32;
+
   /* Determine whether the upper 32 bits are sign-extended from the lower
 32 bits. If it is, the instructions to load the high order can be
 ommitted.  */
   if (lu32i[sign31] && lu52i[sign31])
return cost;
+  /* If the lower 32 bits are the same as the upper 32 bits, just copy
+the lower 32 bits to the upper 32 bits.  */
+  else if (loval == hival)
+   {
+ codes[cost].method = METHOD_MIRROR;
+ codes[cost].curr_value = value;
+ return cost + 1;
+   }
   /* Determine whether bits 32-51 are sign-extended from the lower 32
 bits. If so, directly load 52-63 bits.  */
   else if (lu32i[sign31])
@@ -3230,6 +3246,10 @@ loongarch_move_integer (rtx temp, rtx dest, unsigned 
HOST_WIDE_INT value)
   gen_rtx_AND (DImode, x, GEN_INT (0xf)),
   GEN_INT (codes[i].value));
  break;
+   case METHOD_MIRROR:
+ gcc_assert (mode == DImode);
+ emit_insn (gen_insvdi (x, GEN_INT (32), GEN_INT (32), x));
+ break;
default:
  gcc_unreachable ();
}
diff --git a/gcc/testsuite/gcc.target/loongarch/imm-load1.c 
b/gcc/testsuite/gcc.target/loongarch/imm-load1.c
index 2ff02971239..f64cc2956a3 100644
--- a/gcc/testsuite/gcc.target/loongarch/imm-load1.c
+++ b/gcc/testsuite/gcc.target/loongarch/imm-load1.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-options "-mabi=lp64d -O2" } */
-/* { dg-final { scan-assembler "test:.*lu52i\.d.*\n\taddi\.w.*\n\.L2:" } } */
+/* { dg-final { scan-assembler-not "test:.*lu52i\.d.*\n\taddi\.w.*\n\.L2:" } } */
+/* { dg-final { scan-assembler "test:.*lu12i\.w.*\n\tbstrins\.d.*\n\.L2:" } } */
 
 
 extern long long b[10];
-- 
2.20.1



Re: [PATCH] LoongArch: Optimize the loading of immediate numbers with the same high and low 32-bit values

2023-11-20 Thread Guo Jie

Thanks for your advice! I will fix it in patch v2.


On 2023/11/18 at 5:09 PM, Xi Ruoyao wrote:

On Sat, 2023-11-18 at 14:59 +0800, Guo Jie wrote:

For the following immediate load operation in 
gcc/testsuite/gcc.target/loongarch/imm-load1.c:

long long r = 0x0101010101010101;

Before this patch:

lu12i.w     $r15,16842752>>12
ori     $r15,$r15,257
lu32i.d     $r15,0x10101>>32
lu52i.d     $r15,$r15,0x100>>52

After this patch:

lu12i.w $r15,16842752>>12
ori $r15,$r15,257
bstrins.d   $r15,$r15,63,32

gcc/ChangeLog:

* config/loongarch/loongarch.cc (enum loongarch_load_imm_method): Add 
new method.
(loongarch_build_integer): Add relevant implementations for new method.
(loongarch_move_integer): Ditto.

IIRC the ChangeLog line should be wrapped at 72 characters.

/* snip */


  struct loongarch_integer_op
@@ -1556,11 +1560,23 @@ loongarch_build_integer (struct loongarch_integer_op 
*codes,
  
    int sign31 = (value & (HOST_WIDE_INT_1U << 31)) >> 31;

    int sign51 = (value & (HOST_WIDE_INT_1U << 51)) >> 51;
+
+  unsigned HOST_WIDE_INT hival = value >> 32;
+  unsigned HOST_WIDE_INT loval = value << 32 >> 32;

Use

uint32_t hival = (uint32_t) (value >> 32);
uint32_t loval = (uint32_t) value;

instead, because "value << 32" may trigger a left-shift of negative
value.

C++11 doesn't allow shifting left any negative value.  Yes it's allowed
as a GCC extension and it's also allowed by C++23, but GCC codebase is
still C++11.  So it may break GCC if bootstrapping from a different
compiler, and --with-build-config=bootstrap-ubsan will complain.

Otherwise LGTM.
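
A minimal standalone sketch of the suggested approach follows. It assumes a signed 64-bit integer stands in for `HOST_WIDE_INT`, and the function names `halves_are_mirrored` and `check_halves` are hypothetical, chosen only for this example. The halves are extracted with an unsigned right shift and truncating casts, so no negative value is ever shifted left.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 when the upper and lower 32-bit halves of VALUE are equal.  */
static int halves_are_mirrored(int64_t value)
{
  uint64_t bits = (uint64_t) value;          /* well-defined modular conversion */
  uint32_t hival = (uint32_t) (bits >> 32);  /* upper 32 bits */
  uint32_t loval = (uint32_t) bits;          /* lower 32 bits */
  return hival == loval;
}

/* A few spot checks, including a negative input.  */
static void check_halves(void)
{
  assert(halves_are_mirrored(0x0101010101010101LL));
  assert(!halves_are_mirrored(0x0000000101010101LL));
  assert(halves_are_mirrored(-1));
}
```

Converting to an unsigned type before shifting keeps every step well-defined in C and C++11 alike, which is the property the review asks for.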





[PATCH v2] LoongArch: Optimize the loading of immediate numbers with the same high and low 32-bit values

2023-11-22 Thread Guo Jie
For the following immediate load operation in 
gcc/testsuite/gcc.target/loongarch/imm-load1.c:

long long r = 0x0101010101010101;

Before this patch:

lu12i.w $r15,16842752>>12
ori $r15,$r15,257
lu32i.d $r15,0x10101>>32
lu52i.d $r15,$r15,0x100>>52

After this patch:

lu12i.w $r15,16842752>>12
ori $r15,$r15,257
bstrins.d   $r15,$r15,63,32

gcc/ChangeLog:

* config/loongarch/loongarch.cc
(enum loongarch_load_imm_method): Add new method.
(loongarch_build_integer): Add relevant implementations for
new method.
(loongarch_move_integer): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/imm-load1.c: Change old check.

---
Update in v2:
1. Correct the format of ChangeLog.
2. Avoid left shift of negative value in loongarch_build_integer.

---
 gcc/config/loongarch/loongarch.cc | 22 ++-
 .../gcc.target/loongarch/imm-load1.c  |  3 ++-
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index d05743bec87..f95507e2348 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -142,12 +142,16 @@ struct loongarch_address_info
 
METHOD_LU52I:
  Load 52-63 bit of the immediate number.
+
+   METHOD_MIRROR:
+ Copy 0-31 bit of the immediate number to 32-63bit.
 */
 enum loongarch_load_imm_method
 {
   METHOD_NORMAL,
   METHOD_LU32I,
-  METHOD_LU52I
+  METHOD_LU52I,
+  METHOD_MIRROR
 };
 
 struct loongarch_integer_op
@@ -1556,11 +1560,23 @@ loongarch_build_integer (struct loongarch_integer_op 
*codes,
 
   int sign31 = (value & (HOST_WIDE_INT_1U << 31)) >> 31;
   int sign51 = (value & (HOST_WIDE_INT_1U << 51)) >> 51;
+
+  uint32_t hival = (uint32_t) (value >> 32);
+  uint32_t loval = (uint32_t) value;
+
   /* Determine whether the upper 32 bits are sign-extended from the lower
 32 bits. If it is, the instructions to load the high order can be
 ommitted.  */
   if (lu32i[sign31] && lu52i[sign31])
return cost;
+  /* If the lower 32 bits are the same as the upper 32 bits, just copy
+the lower 32 bits to the upper 32 bits.  */
+  else if (loval == hival)
+   {
+ codes[cost].method = METHOD_MIRROR;
+ codes[cost].curr_value = value;
+ return cost + 1;
+   }
   /* Determine whether bits 32-51 are sign-extended from the lower 32
 bits. If so, directly load 52-63 bits.  */
   else if (lu32i[sign31])
@@ -3230,6 +3246,10 @@ loongarch_move_integer (rtx temp, rtx dest, unsigned 
HOST_WIDE_INT value)
   gen_rtx_AND (DImode, x, GEN_INT (0xf)),
   GEN_INT (codes[i].value));
  break;
+   case METHOD_MIRROR:
+ gcc_assert (mode == DImode);
+ emit_insn (gen_insvdi (x, GEN_INT (32), GEN_INT (32), x));
+ break;
default:
  gcc_unreachable ();
}
diff --git a/gcc/testsuite/gcc.target/loongarch/imm-load1.c 
b/gcc/testsuite/gcc.target/loongarch/imm-load1.c
index 2ff02971239..f64cc2956a3 100644
--- a/gcc/testsuite/gcc.target/loongarch/imm-load1.c
+++ b/gcc/testsuite/gcc.target/loongarch/imm-load1.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-options "-mabi=lp64d -O2" } */
-/* { dg-final { scan-assembler "test:.*lu52i\.d.*\n\taddi\.w.*\n\.L2:" } } */
+/* { dg-final { scan-assembler-not "test:.*lu52i\.d.*\n\taddi\.w.*\n\.L2:" } } */
+/* { dg-final { scan-assembler "test:.*lu12i\.w.*\n\tbstrins\.d.*\n\.L2:" } } */
 
 
 extern long long b[10];
-- 
2.36.0



[PATCH] LoongArch: Fix runtime error in a gcc build with --with-build-config=bootstrap-ubsan

2023-11-22 Thread Guo Jie
gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_split_plus_constant):
Avoid left shift of negative value -0x8000.
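
To illustrate the idea, here is a small sketch that assumes the intended lower bound is -0x8000 scaled by 2^16, as the removed expression computes. The function names are hypothetical. It shows two ways to obtain that value without left-shifting a negative number, which is what the undefined-behaviour sanitizer reports.

```c
#include <assert.h>
#include <stdint.h>

/* -0x8000 scaled by 2^16, using multiplication instead of a left shift.  */
static int64_t low_bound_by_multiply(void)
{
  return (int64_t) -0x8000 * 65536;
}

/* The same value written as the complement of a non-negative constant.  */
static int64_t low_bound_by_complement(void)
{
  return ~(int64_t) 0x7fffffff;
}

/* Both forms agree and equal -2^31.  */
static void check_low_bound(void)
{
  assert(low_bound_by_multiply() == low_bound_by_complement());
  assert(low_bound_by_complement() == -2147483648LL);
}
```

Writing the constant directly, rather than shifting at run time, also keeps the expression valid for a C++11 host compiler.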

---
 gcc/config/loongarch/loongarch.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 33357c670e1..81cd9fa1e7c 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -4249,7 +4249,7 @@ loongarch_split_plus_constant (rtx *op, machine_mode mode)
   else if (loongarch_addu16i_imm12_operand_p (v, mode))
 a = (v & ~HWIT_UC_0xFFF) + ((v & 0x800) << 1);
   else if (mode == DImode && DUAL_ADDU16I_OPERAND (v))
-a = (v > 0 ? 0x7fff : -0x8000) << 16;
+a = (v > 0 ? 0x7fff : ~0x7fff);
   else
 gcc_unreachable ();
 
-- 
2.20.1



[PATCH] LoongArch: Optimizations of vector construction.

2023-09-20 Thread Guo Jie
Change-Id: I327f68ab482b94073974e672c71d25c98b35a080

gcc/ChangeLog:

* config/loongarch/lasx.md (lasx_vecinit_merge_): New
pattern for vector construction.
(vec_set_internal): Ditto.
(lasx_xvinsgr2vr__internal): Ditto.
(lasx_xvilvl__internal): Ditto.
* config/loongarch/loongarch.cc (loongarch_expand_vector_init):
Optimized the implementation of vector construction.
(loongarch_expand_vector_init_same): New function.
* config/loongarch/lsx.md (lsx_vilvl__internal): New
pattern for vector construction.
(lsx_vreplvei_mirror_): New pattern for vector
construction.
(vec_concatv2df): Ditto.
(vec_concatv4sf): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/vector/lasx/lasx-vec-construct-opt.c: New test.
* gcc.target/loongarch/vector/lsx/lsx-vec-construct-opt.c: New test.
---
 gcc/config/loongarch/lasx.md  |  69 ++
 gcc/config/loongarch/loongarch.cc | 716 +-
 gcc/config/loongarch/lsx.md   | 134 
 .../vector/lasx/lasx-vec-construct-opt.c  | 102 +++
 .../vector/lsx/lsx-vec-construct-opt.c|  85 +++
 5 files changed, 732 insertions(+), 374 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-vec-construct-opt.c
 create mode 100644 
gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-vec-construct-opt.c

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index 8111c8bb79a..2bc5d47ed4a 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -186,6 +186,9 @@ (define_c_enum "unspec" [
   UNSPEC_LASX_XVLDI
   UNSPEC_LASX_XVLDX
   UNSPEC_LASX_XVSTX
+  UNSPEC_LASX_VECINIT_MERGE
+  UNSPEC_LASX_VEC_SET_INTERNAL
+  UNSPEC_LASX_XVILVL_INTERNAL
 ])
 
 ;; All vector modes with 256 bits.
@@ -255,6 +258,15 @@ (define_mode_attr VFHMODE256
[(V8SF "V4SF")
(V4DF "V2DF")])
 
+;; The attribute gives half int/float modes for vector modes.
+(define_mode_attr VHMODE256_ALL
+  [(V32QI "V16QI")
+   (V16HI "V8HI")
+   (V8SI "V4SI")
+   (V4DI "V2DI")
+   (V8SF "V4SF")
+   (V4DF "V2DF")])
+
 ;; The attribute gives double modes for vector modes in LASX.
 (define_mode_attr VDMODE256
   [(V8SI "V4DI")
@@ -312,6 +324,11 @@ (define_mode_attr mode256_f
(V4DI "v4df")
(V8SI "v8sf")])
 
+;; This attribute gives V32QI mode and V16HI mode with half size.
+(define_mode_attr mode256_i_half
+  [(V32QI "v16qi")
+   (V16HI "v8hi")])
+
  ;; This attribute gives suffix for LASX instructions.  HOW?
 (define_mode_attr lasxfmt
   [(V4DF "d")
@@ -756,6 +773,20 @@ (define_insn "lasx_xvpermi_q_"
   [(set_attr "type" "simd_splat")
(set_attr "mode" "")])
 
+;; Only for loongarch_expand_vector_init in loongarch.cc.
+;; Support a LSX-mode input op2.
+(define_insn "lasx_vecinit_merge_"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+   (unspec:LASX
+ [(match_operand:LASX 1 "register_operand" "0")
+  (match_operand: 2 "register_operand" "f")
+  (match_operand 3 "const_uimm8_operand")]
+  UNSPEC_LASX_VECINIT_MERGE))]
+  "ISA_HAS_LASX"
+  "xvpermi.q\t%u0,%u2,%3"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "")])
+
 (define_insn "lasx_xvpickve2gr_d"
   [(set (match_operand:DI 0 "register_operand" "=r")
(any_extend:DI
@@ -779,6 +810,33 @@ (define_expand "vec_set"
   DONE;
 })
 
+;; Only for loongarch_expand_vector_init in loongarch.cc.
+;; Simulate missing instructions xvinsgr2vr.b and xvinsgr2vr.h.
+(define_expand "vec_set_internal"
+  [(match_operand:ILASX_HB 0 "register_operand")
+   (match_operand: 1 "reg_or_0_operand")
+   (match_operand 2 "const__operand")]
+  "ISA_HAS_LASX"
+{
+  rtx index = GEN_INT (1 << INTVAL (operands[2]));
+  emit_insn (gen_lasx_xvinsgr2vr__internal
+(operands[0], operands[1], operands[0], index));
+  DONE;
+})
+
+(define_insn "lasx_xvinsgr2vr__internal"
+  [(set (match_operand:ILASX_HB 0 "register_operand" "=f")
+   (unspec:ILASX_HB [(match_operand: 1 "reg_or_0_operand" "rJ")
+ (match_operand:ILASX_HB 2 "register_operand" "0")
+ (match_operand 3 "const__operand" "")]
+UNSPEC_LASX_VEC_SET_INTERNAL))]
+  "ISA_HAS_LASX"
+{
+  return "vinsgr2vr.\t%w0,%z1,%y3";
+}
+  [(set_attr "type" "simd_insert")
+   (set_attr "mode" "")])
+
 (define_expand "vec_set"
   [(match_operand:FLASX 0 "register_operand")
(match_operand: 1 "reg_or_0_operand")
@@ -1567,6 +1625,17 @@ (define_insn "logb2"
   [(set_attr "type" "simd_flog2")
(set_attr "mode" "")])
 
+;; Only for loongarch_expand_vector_init in loongarch.cc.
+;; Merge two scalar floating-point op1 and op2 into a LASX op0.
+(define_insn "lasx_xvilvl__internal"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+   (unspec:FLASX [(match_operand: 1 "register_operand" "f")
+  (match_operand: 2 "register_operand" "f")]
+   

[PATCH] LoongArch: Optimizations of vector construction.

2023-09-20 Thread Guo Jie
gcc/ChangeLog:

* config/loongarch/lasx.md (lasx_vecinit_merge_): New
pattern for vector construction.
(vec_set_internal): Ditto.
(lasx_xvinsgr2vr__internal): Ditto.
(lasx_xvilvl__internal): Ditto.
* config/loongarch/loongarch.cc (loongarch_expand_vector_init):
Optimized the implementation of vector construction.
(loongarch_expand_vector_init_same): New function.
* config/loongarch/lsx.md (lsx_vilvl__internal): New
pattern for vector construction.
(lsx_vreplvei_mirror_): New pattern for vector
construction.
(vec_concatv2df): Ditto.
(vec_concatv4sf): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/vector/lasx/lasx-vec-construct-opt.c: New test.
* gcc.target/loongarch/vector/lsx/lsx-vec-construct-opt.c: New test.
---
 gcc/config/loongarch/lasx.md  |  69 ++
 gcc/config/loongarch/loongarch.cc | 716 +-
 gcc/config/loongarch/lsx.md   | 134 
 .../vector/lasx/lasx-vec-construct-opt.c  | 102 +++
 .../vector/lsx/lsx-vec-construct-opt.c|  85 +++
 5 files changed, 732 insertions(+), 374 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-vec-construct-opt.c
 create mode 100644 
gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-vec-construct-opt.c

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index 8111c8bb79a..2bc5d47ed4a 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -186,6 +186,9 @@ (define_c_enum "unspec" [
   UNSPEC_LASX_XVLDI
   UNSPEC_LASX_XVLDX
   UNSPEC_LASX_XVSTX
+  UNSPEC_LASX_VECINIT_MERGE
+  UNSPEC_LASX_VEC_SET_INTERNAL
+  UNSPEC_LASX_XVILVL_INTERNAL
 ])
 
 ;; All vector modes with 256 bits.
@@ -255,6 +258,15 @@ (define_mode_attr VFHMODE256
[(V8SF "V4SF")
(V4DF "V2DF")])
 
+;; The attribute gives half int/float modes for vector modes.
+(define_mode_attr VHMODE256_ALL
+  [(V32QI "V16QI")
+   (V16HI "V8HI")
+   (V8SI "V4SI")
+   (V4DI "V2DI")
+   (V8SF "V4SF")
+   (V4DF "V2DF")])
+
 ;; The attribute gives double modes for vector modes in LASX.
 (define_mode_attr VDMODE256
   [(V8SI "V4DI")
@@ -312,6 +324,11 @@ (define_mode_attr mode256_f
(V4DI "v4df")
(V8SI "v8sf")])
 
+;; This attribute gives V32QI mode and V16HI mode with half size.
+(define_mode_attr mode256_i_half
+  [(V32QI "v16qi")
+   (V16HI "v8hi")])
+
  ;; This attribute gives suffix for LASX instructions.  HOW?
 (define_mode_attr lasxfmt
   [(V4DF "d")
@@ -756,6 +773,20 @@ (define_insn "lasx_xvpermi_q_"
   [(set_attr "type" "simd_splat")
(set_attr "mode" "")])
 
+;; Only for loongarch_expand_vector_init in loongarch.cc.
+;; Support a LSX-mode input op2.
+(define_insn "lasx_vecinit_merge_"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+   (unspec:LASX
+ [(match_operand:LASX 1 "register_operand" "0")
+  (match_operand: 2 "register_operand" "f")
+  (match_operand 3 "const_uimm8_operand")]
+  UNSPEC_LASX_VECINIT_MERGE))]
+  "ISA_HAS_LASX"
+  "xvpermi.q\t%u0,%u2,%3"
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "")])
+
 (define_insn "lasx_xvpickve2gr_d"
   [(set (match_operand:DI 0 "register_operand" "=r")
(any_extend:DI
@@ -779,6 +810,33 @@ (define_expand "vec_set"
   DONE;
 })
 
+;; Only for loongarch_expand_vector_init in loongarch.cc.
+;; Simulate missing instructions xvinsgr2vr.b and xvinsgr2vr.h.
+(define_expand "vec_set_internal"
+  [(match_operand:ILASX_HB 0 "register_operand")
+   (match_operand: 1 "reg_or_0_operand")
+   (match_operand 2 "const__operand")]
+  "ISA_HAS_LASX"
+{
+  rtx index = GEN_INT (1 << INTVAL (operands[2]));
+  emit_insn (gen_lasx_xvinsgr2vr__internal
+(operands[0], operands[1], operands[0], index));
+  DONE;
+})
+
+(define_insn "lasx_xvinsgr2vr__internal"
+  [(set (match_operand:ILASX_HB 0 "register_operand" "=f")
+   (unspec:ILASX_HB [(match_operand: 1 "reg_or_0_operand" "rJ")
+ (match_operand:ILASX_HB 2 "register_operand" "0")
+ (match_operand 3 "const__operand" "")]
+UNSPEC_LASX_VEC_SET_INTERNAL))]
+  "ISA_HAS_LASX"
+{
+  return "vinsgr2vr.\t%w0,%z1,%y3";
+}
+  [(set_attr "type" "simd_insert")
+   (set_attr "mode" "")])
+
 (define_expand "vec_set"
   [(match_operand:FLASX 0 "register_operand")
(match_operand: 1 "reg_or_0_operand")
@@ -1567,6 +1625,17 @@ (define_insn "logb2"
   [(set_attr "type" "simd_flog2")
(set_attr "mode" "")])
 
+;; Only for loongarch_expand_vector_init in loongarch.cc.
+;; Merge two scalar floating-point op1 and op2 into a LASX op0.
+(define_insn "lasx_xvilvl__internal"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+   (unspec:FLASX [(match_operand: 1 "register_operand" "f")
+  (match_operand: 2 "register_operand" "f")]
+ UNSPEC_LASX_XVILVL_INTERNAL))]
+  "ISA_H

Re: [PATCH] LoongArch: Enable shrink wrapping

2023-04-25 Thread Guo Jie

/* snip */

diff --git a/gcc/testsuite/gcc.target/loongarch/shrink-wrap.c b/gcc/testsuite/gcc.target/loongarch/shrink-wrap.c

new file mode 100644
index 000..f2c867a2769
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/shrink-wrap.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fshrink-wrap" } */
+
+/* f(x) should do nothing if x is 0.  */
+/* { dg-final { scan-assembler "bnez\t\\\$r4,\[^\n\]*\n\tjr\t\\\$r1" } } */
+
+void g(void);
+
+void
+f(int x)
+{
+  if (x)
+    {
+  register int s0 asm("s0") = x;
+  register int s1 asm("s1") = x;
+  register int s2 asm("s2") = x;
+  asm("" : : "r"(s0));
+  asm("" : : "r"(s1));
+  asm("" : : "r"(s2));
+  g();
+    }
+}


I think the test case cannot fully reflect the optimization effect of
the current patch, because even without the patch, -O -fshrink-wrap
still performs architecture-independent shrink wrapping.

This patch takes architecture-specific registers into account to make
shrink wrapping finer-grained, so I think a test case like the one
below is more suitable:


int foo(int x)
{
  if (x)
  {
    __asm__ ("":::"s0","s1");
    return x;
  }

  __asm__ ("":::"s2","s3");
  return 0;
}

Otherwise LGTM, thanks!



[PATCH] LoongArch: Remove useless UNSPECs and define_mode_attrs

2024-12-29 Thread Guo Jie
gcc/ChangeLog:

* config/loongarch/lasx.md: Remove useless code.
* config/loongarch/lsx.md: Ditto.

---
 gcc/config/loongarch/lasx.md | 66 
 gcc/config/loongarch/lsx.md  | 35 ---
 2 files changed, 101 deletions(-)

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index 071a5cb1733..8afd0ffd7c5 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -37,16 +37,12 @@ (define_c_enum "unspec" [
   UNSPEC_LASX_XVFCVTH
   UNSPEC_LASX_XVFCVTL
   UNSPEC_LASX_XVFLOGB
-  UNSPEC_LASX_XVFRECIP
   UNSPEC_LASX_XVFRECIPE
-  UNSPEC_LASX_XVFRINT
   UNSPEC_LASX_XVFRSQRT
   UNSPEC_LASX_XVFRSQRTE
   UNSPEC_LASX_XVFTINT_U
-  UNSPEC_LASX_XVCLO
   UNSPEC_LASX_XVSAT_S
   UNSPEC_LASX_XVSAT_U
-  UNSPEC_LASX_XVREPLVE0
   UNSPEC_LASX_XVREPL128VEI
   UNSPEC_LASX_XVSRAR
   UNSPEC_LASX_XVSRARI
@@ -57,7 +53,6 @@ (define_c_enum "unspec" [
   UNSPEC_LASX_BRANCH
   UNSPEC_LASX_BRANCH_V
 
-  UNSPEC_LASX_MXVEXTW_U
   UNSPEC_LASX_XVSLLWIL_S
   UNSPEC_LASX_XVSLLWIL_U
   UNSPEC_LASX_XVSRAN
@@ -130,7 +125,6 @@ (define_c_enum "unspec" [
   UNSPEC_LASX_XVADD_Q
   UNSPEC_LASX_XVSUB_Q
   UNSPEC_LASX_XVREPLVE
-  UNSPEC_LASX_XVSHUF4
   UNSPEC_LASX_XVMSKGEZ
   UNSPEC_LASX_XVMSKNZ
   UNSPEC_LASX_XVEXTH_Q_D
@@ -212,11 +206,6 @@ (define_mode_attr VHMODE256
(V8SI "V4SI")
(V4DI "V2DI")])
 
-;;attribute gives half float modes for vector modes.
-(define_mode_attr VFHMODE256
-   [(V8SF "V4SF")
-   (V4DF "V2DF")])
-
 ;; The attribute gives half int/float modes for vector modes.
 (define_mode_attr VHMODE256_ALL
   [(V32QI "V16QI")
@@ -252,20 +241,6 @@ (define_mode_attr VEMODE256
(V4DF "V8DF")
(V4DI "V8DI")])
 
-;; This attribute gives the mode of the result for "copy_s_b, copy_u_b" etc.
-(define_mode_attr VRES256
-  [(V4DF "DF")
-   (V8SF "SF")
-   (V4DI "DI")
-   (V8SI "SI")
-   (V16HI "SI")
-   (V32QI "SI")])
-
-;; Only used with LASX_D iterator.
-(define_mode_attr lasx_d
-  [(V4DI "reg_or_0")
-   (V4DF "register")])
-
 ;; This attribute gives the 256 bit integer vector mode with same size.
 (define_mode_attr mode256_i
   [(V4DF "v4di")
@@ -275,14 +250,6 @@ (define_mode_attr mode256_i
(V16HI "v16hi")
(V32QI "v32qi")])
 
-
-;; This attribute gives the 256 bit float vector mode with same size.
-(define_mode_attr mode256_f
-  [(V4DF "v4df")
-   (V8SF "v8sf")
-   (V4DI "v4df")
-   (V8SI "v8sf")])
-
 ;; This attribute gives V32QI mode and V16HI mode with half size.
 (define_mode_attr mode256_i_half
   [(V32QI "v16qi")
@@ -344,14 +311,6 @@ (define_mode_attr lasxfmt_f
(V16HI "h")
(V32QI "b")])
 
-(define_mode_attr flasxfmt_f
-  [(V4DF "d_f")
-   (V8SF "s_f")
-   (V4DI "d")
-   (V8SI "w")
-   (V16HI "h")
-   (V32QI "b")])
-
 ;; This attribute gives define_insn suffix for LASX instructions that need
 ;; distinction between integer and floating point.
 (define_mode_attr lasxfmt_f_wd
@@ -438,27 +397,6 @@ (define_mode_attr bitimm256
(V4DI  "uimm6")])
 
 
-(define_mode_attr d2lasxfmt
-  [(V8SI "q")
-   (V16HI "d")
-   (V32QI "w")])
-
-(define_mode_attr d2lasxfmt_u
-  [(V8SI "qu")
-   (V16HI "du")
-   (V32QI "wu")])
-
-(define_mode_attr VD2MODE256
-  [(V8SI "V4DI")
-   (V16HI "V4DI")
-   (V32QI "V8SI")])
-
-(define_mode_attr lasxfmt_wd
-  [(V4DI "d")
-   (V8SI "w")
-   (V16HI "w")
-   (V32QI "w")])
-
 ;; Half modes of all LASX vector modes, in lower-case.
 (define_mode_attr lasxhalf [(V32QI "v16qi")  (V16HI "v8hi")
  (V8SI "v4si")  (V4DI  "v2di")
@@ -1402,10 +1340,6 @@ (define_insn "floatuns2"
(set_attr "cnv_mode" "")
(set_attr "mode" "")])
 
-(define_mode_attr FFQ256
-  [(V4SF "V16HI")
-   (V2DF "V8SI")])
-
 (define_insn "lasx_xvreplgr2vr_"
   [(set (match_operand:ILASX 0 "register_operand" "=f,f")
(vec_duplicate:ILASX
diff --git a/gcc/config/loongarch/lsx.md b/gcc/config/loongarch/lsx.md
index 878ff11e1ac..6c92e69d235 100644
--- a/gcc/config/loongarch/lsx.md
+++ b/gcc/config/loongarch/lsx.md
@@ -39,15 +39,12 @@ (define_c_enum "unspec" [
   UNSPEC_LSX_VFCVTH
   UNSPEC_LSX_VFCVTL
   UNSPEC_LSX_VFLOGB
-  UNSPEC_LSX_VFRECIP
   UNSPEC_LSX_VFRECIPE
-  UNSPEC_LSX_VFRINT
   UNSPEC_LSX_VFRSQRT
   UNSPEC_LSX_VFRSQRTE
   UNSPEC_LSX_VFTINT_U
   UNSPEC_LSX_VSAT_S
   UNSPEC_LSX_VSAT_U
-  UNSPEC_LSX_VREPLVEI
   UNSPEC_LSX_VSRAR
   UNSPEC_LSX_VSRARI
   UNSPEC_LSX_VSRLR
@@ -167,22 +164,6 @@ (define_mode_attr dlsxfmt_u
(V8HI "wu")
(V16QI "hu")])
 
-(define_mode_attr d2lsxfmt
-  [(V4SI "q")
-   (V8HI "d")
-   (V16QI "w")])
-
-(define_mode_attr d2lsxfmt_u
-  [(V4SI "qu")
-   (V8HI "du")
-   (V16QI "wu")])
-
-;; The attribute gives two double modes for vector modes.
-(define_mode_attr VD2MODE
-  [(V4SI "V2DI")
-   (V8HI "V2DI")
-   (V16QI "V4SI")])
-
 ;; Only used for vilvh and splitting insert_d and copy_{u,s}.d.
 (define_mode_iterator LSX_D[V2DI V2DF])
 
@@ -299,24 +280,12 @@ (define_mode_attr lsxfmt_f
(V8HI "h")
(V16QI "b")])
 
-(define_mode_attr flsxfmt_f
-  [(V2DF "d_f")
-   (V4SF "s_f")
-   

[PATCH] LoongArch: Fix bugs in insn patterns lasx_xvrepl128vei_b/h/w/d_internal

2024-12-29 Thread Guo Jie
There are two aspects that affect the matching of instruction templates:

1. vec_duplicate is redundant in the following operations.
set (match_operand:V4DI ...)
(vec_duplicate:V4DI (vec_select:V4DI ...))

2. The ranges accepted by the predicates const_8_to_15_operand and
const_16_to_31_operand should be [8, 15] and [16, 31] respectively.

However, there is currently no suitable testcase to verify this.
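
To make the second point concrete, here is a small C model of what the
byte variant of the pattern describes. It is only a sketch:
model_xvrepl128vei_b and demo_xvrepl128vei_b are names invented for
illustration, not GCC functions. Each 128-bit half of the source
broadcasts its own element, so the selector entries for the low half lie
in [0, 15] and those for the high half are exactly 16 larger, i.e. in
[16, 31].

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of xvrepl128vei.b (not GCC code).
   src and dest are 256-bit vectors viewed as 32 bytes.  */
void
model_xvrepl128vei_b (uint8_t dest[32], const uint8_t src[32], unsigned idx)
{
  assert (idx < 16);            /* operand 2: a 4-bit unsigned immediate */
  unsigned hi = idx + 16;       /* operand 3: always lands in [16, 31] */
  for (int i = 0; i < 16; i++)
    {
      dest[i] = src[idx];       /* low 128 bits */
      dest[16 + i] = src[hi];   /* high 128 bits */
    }
}

/* Usage sketch: with src[i] == i, selecting idx 5 yields 5 in the low
   half and 21 in the high half.  */
void
demo_xvrepl128vei_b (void)
{
  uint8_t src[32], dest[32];
  for (int i = 0; i < 32; i++)
    src[i] = (uint8_t) i;
  model_xvrepl128vei_b (dest, src, 5);
  assert (dest[0] == 5 && dest[15] == 5);
  assert (dest[16] == 21 && dest[31] == 21);
}
```

The difference of 16 between the two selectors is the same relation the
insn condition checks with INTVAL (operands[3]) - INTVAL (operands[2]).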

gcc/ChangeLog:

* config/loongarch/lasx.md: Remove useless vec_duplicate.
* config/loongarch/predicates.md: Correct erroneous predicates.

---
 gcc/config/loongarch/lasx.md   | 76 ++
 gcc/config/loongarch/predicates.md |  4 +-
 2 files changed, 38 insertions(+), 42 deletions(-)

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index 071a5cb1733..039b23795be 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -2347,21 +2347,20 @@ (define_insn "lasx_xvreplve0_"
 
 (define_insn "lasx_xvrepl128vei_b_internal"
   [(set (match_operand:V32QI 0 "register_operand" "=f")
-   (vec_duplicate:V32QI
- (vec_select:V32QI
-   (match_operand:V32QI 1 "register_operand" "f")
-   (parallel [(match_operand 2 "const_uimm4_operand" "")
-  (match_dup 2) (match_dup 2) (match_dup 2)
-  (match_dup 2) (match_dup 2) (match_dup 2)
-  (match_dup 2) (match_dup 2) (match_dup 2)
-  (match_dup 2) (match_dup 2) (match_dup 2)
-  (match_dup 2) (match_dup 2) (match_dup 2)
-  (match_operand 3 "const_16_to_31_operand" "")
-  (match_dup 3) (match_dup 3) (match_dup 3)
-  (match_dup 3) (match_dup 3) (match_dup 3)
-  (match_dup 3) (match_dup 3) (match_dup 3)
-  (match_dup 3) (match_dup 3) (match_dup 3)
-  (match_dup 3) (match_dup 3) (match_dup 3)]]
+   (vec_select:V32QI
+ (match_operand:V32QI 1 "register_operand" "f")
+ (parallel [(match_operand 2 "const_uimm4_operand" "")
+(match_dup 2) (match_dup 2) (match_dup 2)
+(match_dup 2) (match_dup 2) (match_dup 2)
+(match_dup 2) (match_dup 2) (match_dup 2)
+(match_dup 2) (match_dup 2) (match_dup 2)
+(match_dup 2) (match_dup 2) (match_dup 2)
+(match_operand 3 "const_16_to_31_operand" "")
+(match_dup 3) (match_dup 3) (match_dup 3)
+(match_dup 3) (match_dup 3) (match_dup 3)
+(match_dup 3) (match_dup 3) (match_dup 3)
+(match_dup 3) (match_dup 3) (match_dup 3)
+(match_dup 3) (match_dup 3) (match_dup 3)])))]
   "ISA_HAS_LASX && ((INTVAL (operands[3]) - INTVAL (operands[2])) == 16)"
   "xvrepl128vei.b\t%u0,%u1,%2"
   [(set_attr "type" "simd_splat")
@@ -2369,17 +2368,16 @@ (define_insn "lasx_xvrepl128vei_b_internal"
 
 (define_insn "lasx_xvrepl128vei_h_internal"
   [(set (match_operand:V16HI 0 "register_operand" "=f")
-   (vec_duplicate:V16HI
- (vec_select:V16HI
-   (match_operand:V16HI 1 "register_operand" "f")
-   (parallel [(match_operand 2 "const_uimm3_operand" "")
-  (match_dup 2) (match_dup 2) (match_dup 2)
-  (match_dup 2) (match_dup 2) (match_dup 2)
-  (match_dup 2)
-  (match_operand 3 "const_8_to_15_operand" "")
-  (match_dup 3) (match_dup 3) (match_dup 3)
-  (match_dup 3) (match_dup 3) (match_dup 3)
-  (match_dup 3)]]
+   (vec_select:V16HI
+ (match_operand:V16HI 1 "register_operand" "f")
+ (parallel [(match_operand 2 "const_uimm3_operand" "")
+(match_dup 2) (match_dup 2) (match_dup 2)
+(match_dup 2) (match_dup 2) (match_dup 2)
+(match_dup 2)
+(match_operand 3 "const_8_to_15_operand" "")
+(match_dup 3) (match_dup 3) (match_dup 3)
+(match_dup 3) (match_dup 3) (match_dup 3)
+(match_dup 3)])))]
   "ISA_HAS_LASX && ((INTVAL (operands[3]) - INTVAL (operands[2])) == 8)"
   "xvrepl128vei.h\t%u0,%u1,%2"
   [(set_attr "type" "simd_splat")
@@ -2387,13 +2385,12 @@ (define_insn "lasx_xvrepl128vei_h_internal"
 
 (define_insn "lasx_xvrepl128vei_w_internal"
   [(set (match_operand:V8SI 0 "register_operand" "=f")
-   (vec_duplicate:V8SI
- (vec_select:V8SI
-   (match_operand:V8SI 1 "register_operand" "f")
-   (parallel [(match_operand 2 "const_0_to_3_operand" "")
-  (match_dup 2) (match_dup 2) (match_dup 2)
-  (match_operand 3 "const_4_to_7_operand" "")
-  (match_dup 3) (match_dup 3) (match_dup 3)]]
+   (vec_select:V8SI

[PATCH] LoongArch: Fix selector error in lasx_xvexth_h/w/d* patterns

2024-12-29 Thread Guo Jie
The xvexth instructions operate SEPARATELY on the high and low 128-bit
halves: they sign/zero-extend the upper half of each 128-bit half of src
into the corresponding 128 bits of dest.

For xvexth.d.w, the rule for the first element of dest should be:
dest.D[0] = sign_extend (src.W[2], 64);
instead of:
dest.D[0] = sign_extend (src.W[4], 64);
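
The rule can be sketched as a scalar C model. This is an illustration
only: model_xvexth_d_w and demo_xvexth_d_w are hypothetical names, and
the model covers just the signed word-to-doubleword case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of xvexth.d.w (not GCC code).
   src holds eight 32-bit elements W[0..7]; dest holds four 64-bit
   elements D[0..3].  Lane 0 is W[0..3], lane 1 is W[4..7].  */
void
model_xvexth_d_w (int64_t dest[4], const int32_t src[8])
{
  for (int lane = 0; lane < 2; lane++)
    for (int i = 0; i < 2; i++)
      /* Take the upper half of this lane: elements 2 and 3 within it.  */
      dest[lane * 2 + i] = (int64_t) src[lane * 4 + 2 + i];
}

/* Usage sketch: the source elements read are W[2], W[3], W[6], W[7].  */
void
demo_xvexth_d_w (void)
{
  const int32_t src[8] = { 10, 11, 12, -13, 14, 15, 16, -17 };
  int64_t dest[4];
  model_xvexth_d_w (dest, src);
  assert (dest[0] == 12 && dest[1] == -13);
  assert (dest[2] == 16 && dest[3] == -17);
}
```

Under this model the selector for the V8SI source is {2, 3, 6, 7}, not
{4, 5, 6, 7}.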

gcc/ChangeLog:

* config/loongarch/lasx.md: Fix selector index.

---
 gcc/config/loongarch/lasx.md | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index 071a5cb1733..7d3c035eef4 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -4249,10 +4249,10 @@ (define_insn "lasx_xvexth_h_b"
(any_extend:V16HI
  (vec_select:V16QI
(match_operand:V32QI 1 "register_operand" "f")
- (parallel [(const_int 16) (const_int 17)
-(const_int 18) (const_int 19)
-(const_int 20) (const_int 21)
-(const_int 22) (const_int 23)
+ (parallel [(const_int 8) (const_int 9)
+(const_int 10) (const_int 11)
+(const_int 12) (const_int 13)
+(const_int 14) (const_int 15)
 (const_int 24) (const_int 25)
 (const_int 26) (const_int 27)
 (const_int 28) (const_int 29)
@@ -4267,8 +4267,8 @@ (define_insn "lasx_xvexth_w_h"
(any_extend:V8SI
  (vec_select:V8HI
(match_operand:V16HI 1 "register_operand" "f")
-   (parallel [(const_int 8) (const_int 9)
-  (const_int 10) (const_int 11)
+   (parallel [(const_int 4) (const_int 5)
+  (const_int 6) (const_int 7)
   (const_int 12) (const_int 13)
   (const_int 14) (const_int 15)]]
   "ISA_HAS_LASX"
@@ -4281,7 +4281,7 @@ (define_insn "lasx_xvexth_d_w"
(any_extend:V4DI
  (vec_select:V4SI
(match_operand:V8SI 1 "register_operand" "f")
-   (parallel [(const_int 4) (const_int 5)
+   (parallel [(const_int 2) (const_int 3)
   (const_int 6) (const_int 7)]]
   "ISA_HAS_LASX"
   "xvexth.d.w\t%u0,%u1"
-- 
2.20.1



[PATCH] LoongArch: Add standard patterns uabd and sabd

2024-12-29 Thread Guo Jie
gcc/ChangeLog:

* config/loongarch/lasx.md (lasx_xvabsd_s_): Remove.
(abd3): New insn pattern.
(lasx_xvabsd_u_): Remove.
* config/loongarch/loongarch-builtins.cc (CODE_FOR_lsx_vabsd_b):
Rename.
(CODE_FOR_lsx_vabsd_h): Ditto.
(CODE_FOR_lsx_vabsd_w): Ditto.
(CODE_FOR_lsx_vabsd_d): Ditto.
(CODE_FOR_lsx_vabsd_bu): Ditto.
(CODE_FOR_lsx_vabsd_hu): Ditto.
(CODE_FOR_lsx_vabsd_wu): Ditto.
(CODE_FOR_lsx_vabsd_du): Ditto.
(CODE_FOR_lasx_xvabsd_b): Ditto.
(CODE_FOR_lasx_xvabsd_h): Ditto.
(CODE_FOR_lasx_xvabsd_w): Ditto.
(CODE_FOR_lasx_xvabsd_d): Ditto.
(CODE_FOR_lasx_xvabsd_bu): Ditto.
(CODE_FOR_lasx_xvabsd_hu): Ditto.
(CODE_FOR_lasx_xvabsd_wu): Ditto.
(CODE_FOR_lasx_xvabsd_du): Ditto.
* config/loongarch/loongarch.md (u): Add smax/umax.
* config/loongarch/lsx.md (SU_MAX): New iterator.
(su_min): New attr.
(lsx_vabsd_s_): Remove.
(abd3): New insn pattern.
(lsx_vabsd_u_): Remove.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/abd-lasx.c: New test.
* gcc.target/loongarch/abd-lsx.c: New test.
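
For readers unfamiliar with the uabd/sabd standard names, the new abd3
pattern expresses absolute difference as "maximum minus minimum" of the
two inputs. A per-element C sketch follows; the 8-bit element width and
the names model_uabd8, model_sabd8 and demo_abd8 are assumptions made
for illustration, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned absolute difference of one 8-bit element: umax - umin.  */
uint8_t
model_uabd8 (uint8_t a, uint8_t b)
{
  uint8_t max = a > b ? a : b;
  uint8_t min = a > b ? b : a;
  return (uint8_t) (max - min);
}

/* Signed absolute difference of one 8-bit element: smax - smin.
   The result is returned as the 8-bit pattern left in the element.  */
uint8_t
model_sabd8 (int8_t a, int8_t b)
{
  int max = a > b ? a : b;
  int min = a > b ? b : a;
  return (uint8_t) (max - min);
}

/* Usage sketch.  */
void
demo_abd8 (void)
{
  assert (model_uabd8 (3, 250) == 247);
  assert (model_sabd8 (-3, 4) == 7);
  assert (model_sabd8 (127, -128) == 255);
}
```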

---
 gcc/config/loongarch/lasx.md  | 30 +++--
 gcc/config/loongarch/loongarch-builtins.cc| 32 -
 gcc/config/loongarch/loongarch.md |  6 +-
 gcc/config/loongarch/lsx.md   | 35 +-
 gcc/testsuite/gcc.target/loongarch/abd-lasx.c | 67 +++
 gcc/testsuite/gcc.target/loongarch/abd-lsx.c  | 67 +++
 6 files changed, 181 insertions(+), 56 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/abd-lasx.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/abd-lsx.c

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index 071a5cb1733..bac4e3b9435 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -20,8 +20,6 @@
 ;;
 
 (define_c_enum "unspec" [
-  UNSPEC_LASX_XVABSD_S
-  UNSPEC_LASX_XVABSD_U
   UNSPEC_LASX_XVAVG_S
   UNSPEC_LASX_XVAVG_U
   UNSPEC_LASX_XVAVGR_S
@@ -1206,23 +1204,17 @@ (define_insn "usadd3"
   [(set_attr "type" "simd_int_arith")
(set_attr "mode" "")])
 
-(define_insn "lasx_xvabsd_s_"
+(define_insn "abd3"
   [(set (match_operand:ILASX 0 "register_operand" "=f")
-   (unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
-  (match_operand:ILASX 2 "register_operand" "f")]
- UNSPEC_LASX_XVABSD_S))]
-  "ISA_HAS_LASX"
-  "xvabsd.\t%u0,%u1,%u2"
-  [(set_attr "type" "simd_int_arith")
-   (set_attr "mode" "")])
-
-(define_insn "lasx_xvabsd_u_"
-  [(set (match_operand:ILASX 0 "register_operand" "=f")
-   (unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
-  (match_operand:ILASX 2 "register_operand" "f")]
- UNSPEC_LASX_XVABSD_U))]
+   (minus:ILASX
+ (SU_MAX:ILASX
+   (match_operand:ILASX 1 "register_operand" "f")
+   (match_operand:ILASX 2 "register_operand" "f"))
+ (:ILASX
+   (match_dup 1)
+   (match_dup 2]
   "ISA_HAS_LASX"
-  "xvabsd.\t%u0,%u1,%u2"
+  "xvabsd.\t%u0,%u1,%u2"
   [(set_attr "type" "simd_int_arith")
(set_attr "mode" "")])
 
@@ -4904,7 +4896,7 @@ (define_expand "usadv32qi"
   rtx t1 = gen_reg_rtx (V32QImode);
   rtx t2 = gen_reg_rtx (V16HImode);
   rtx t3 = gen_reg_rtx (V8SImode);
-  emit_insn (gen_lasx_xvabsd_u_bu (t1, operands[1], operands[2]));
+  emit_insn (gen_uabdv32qi3 (t1, operands[1], operands[2]));
   emit_insn (gen_lasx_xvhaddw_hu_bu (t2, t1, t1));
   emit_insn (gen_lasx_xvhaddw_wu_hu (t3, t2, t2));
   emit_insn (gen_addv8si3 (operands[0], t3, operands[3]));
@@ -4921,7 +4913,7 @@ (define_expand "ssadv32qi"
   rtx t1 = gen_reg_rtx (V32QImode);
   rtx t2 = gen_reg_rtx (V16HImode);
   rtx t3 = gen_reg_rtx (V8SImode);
-  emit_insn (gen_lasx_xvabsd_s_b (t1, operands[1], operands[2]));
+  emit_insn (gen_sabdv32qi3 (t1, operands[1], operands[2]));
   emit_insn (gen_lasx_xvhaddw_hu_bu (t2, t1, t1));
   emit_insn (gen_lasx_xvhaddw_wu_hu (t3, t2, t2));
   emit_insn (gen_addv8si3 (operands[0], t3, operands[3]));
diff --git a/gcc/config/loongarch/loongarch-builtins.cc b/gcc/config/loongarch/loongarch-builtins.cc
index 261c5eb5546..75313ae2c9b 100644
--- a/gcc/config/loongarch/loongarch-builtins.cc
+++ b/gcc/config/loongarch/loongarch-builtins.cc
@@ -448,14 +448,14 @@ AVAIL_ALL (lasx_frecipe, ISA_HAS_LASX && ISA_HAS_FRECIPE)
 #define CODE_FOR_lsx_vssub_hu CODE_FOR_lsx_vssub_u_hu
 #define CODE_FOR_lsx_vssub_wu CODE_FOR_lsx_vssub_u_wu
 #define CODE_FOR_lsx_vssub_du CODE_FOR_lsx_vssub_u_du
-#define CODE_FOR_lsx_vabsd_b CODE_FOR_lsx_vabsd_s_b
-#define CODE_FOR_lsx_vabsd_h CODE_FOR_lsx_vabsd_s_h
-#define CODE_FOR_lsx_vabsd_w CODE_FOR_lsx_vabsd_s_w
-#define CODE_FOR_lsx_vabsd_d CODE_FOR_lsx_vabsd_s_d
-#define CODE_FOR_lsx_vabsd_bu CODE_FOR_lsx_vabsd_u_bu
-#define CODE_FOR_lsx_vabs

[PATCH] LoongArch: Add some vector pack/unpack patterns

2024-12-29 Thread Guo Jie
gcc/ChangeLog:

* config/loongarch/lasx.md (vec_unpacks_lo_): Redefine.
(vec_unpacku_lo_): Ditto.
(lasx_vext2xv_h_b): Replaced by vec_unpack_lo_v32qi.
(vec_unpack_lo_v32qi): New insn.
(lasx_vext2xv_w_h): Replaced by vec_unpack_lo_v16hi.
(vec_unpack_lo_v16qi_internal): New insn, for 128 bits.
(vec_unpack_lo_v16hi): New insn.
(lasx_vext2xv_d_w): Replaced by vec_unpack_lo_v8si.
(vec_unpack_lo_v8hi_internal): New insn, for 128 bits.
(vec_unpack_lo_v8si): New insn.
(vec_unpack_lo_v4si_internal): New insn, for 128 bits.
(vec_packs_float_v4di): New expander.
(vec_pack_sfix_trunc_v4df): Ditto.
(vec_unpacks_float_hi_v8si): Ditto.
(vec_unpacks_float_lo_v8si): Ditto.
(vec_unpack_sfix_trunc_hi_v8sf): Ditto.
(vec_unpack_sfix_trunc_lo_v8sf): Ditto.
* config/loongarch/loongarch-builtins.cc
(CODE_FOR_lsx_vftintrz_w_d): Rename.
(CODE_FOR_lsx_vftintrzh_l_s): Ditto.
(CODE_FOR_lsx_vftintrzl_l_s): Ditto.
(CODE_FOR_lsx_vffint_s_l): Ditto.
(CODE_FOR_lsx_vffinth_d_w): Ditto.
(CODE_FOR_lsx_vffintl_d_w): Ditto.
(CODE_FOR_lsx_vexth_h_b): Ditto.
(CODE_FOR_lsx_vexth_w_h): Ditto.
(CODE_FOR_lsx_vexth_d_w): Ditto.
(CODE_FOR_lsx_vexth_hu_bu): Ditto.
(CODE_FOR_lsx_vexth_wu_hu): Ditto.
(CODE_FOR_lsx_vexth_du_wu): Ditto.
(CODE_FOR_lsx_vfcvth_d_s): Ditto.
(CODE_FOR_lsx_vfcvtl_d_s): Ditto.
(CODE_FOR_lasx_vext2xv_h_b): Ditto.
(CODE_FOR_lasx_vext2xv_w_h): Ditto.
(CODE_FOR_lasx_vext2xv_d_w): Ditto.
(CODE_FOR_lasx_vext2xv_hu_bu): Ditto.
(CODE_FOR_lasx_vext2xv_wu_hu): Ditto.
(CODE_FOR_lasx_vext2xv_du_wu): Ditto.
(loongarch_expand_builtin_insn): Swap source operands in
CODE_FOR_lsx_vftintrz_w_d and CODE_FOR_lsx_vffint_s_l.
* config/loongarch/loongarch-protos.h
(loongarch_expand_vec_unpack): Remove useless parameter high_p.
* config/loongarch/loongarch.cc (loongarch_expand_vec_unpack):
Rewrite.
* config/loongarch/lsx.md (vec_unpacks_hi_v4sf): Redefine.
(vec_unpacks_lo_v4sf): Ditto.
(vec_unpacks_hi_): Ditto.
(vec_unpacku_hi_): Ditto.
(lsx_vfcvth_d_s): Replaced by vec_unpacks_hi_v4sf.
(lsx_vfcvtl_d_s): Replaced by vec_unpacks_lo_v4sf.
(lsx_vffint_s_l): Replaced by vec_packs_float_v2di.
(vec_packs_float_v2di): New insn.
(lsx_vftintrz_w_d): Replaced by vec_pack_sfix_trunc_v2df.
(vec_pack_sfix_trunc_v2df): New insn.
(lsx_vffinth_d_w): Replaced by vec_unpacks_float_hi_v4si.
(vec_unpacks_float_hi_v4si): New insn.
(lsx_vffintl_d_w): Replaced by vec_unpacks_float_lo_v4si.
(vec_unpacks_float_lo_v4si): New insn.
(lsx_vftintrzh_l_s): Replaced by vec_unpack_sfix_trunc_hi_v4sf.
(vec_unpack_sfix_trunc_hi_v4sf): New insn.
(lsx_vftintrzl_l_s): Replaced by vec_unpack_sfix_trunc_lo_v4sf.
(vec_unpack_sfix_trunc_lo_v4sf): New insn.
(lsx_vexth_h_b): Replaced by vec_unpack_hi_v16qi.
(vec_unpack_hi_v16qi): New insn.
(lsx_vexth_w_h): Replaced by vec_unpack_hi_v8hi.
(vec_unpack_hi_v8hi): New insn.
(lsx_vexth_d_w): Replaced by vec_unpack_hi_v4si.
(vec_unpack_hi_v4si): New insn.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/vec_pack_unpack_128.c: New test.
* gcc.target/loongarch/vec_pack_unpack_256.c: New test.

---
 gcc/config/loongarch/lasx.md  | 140 +++---
 gcc/config/loongarch/loongarch-builtins.cc|  22 +++
 gcc/config/loongarch/loongarch-protos.h   |   2 +-
 gcc/config/loongarch/loongarch.cc |  49 ++
 gcc/config/loongarch/lsx.md   | 120 ++-
 .../loongarch/vec_pack_unpack_128.c   | 120 +++
 .../loongarch/vec_pack_unpack_256.c   | 118 +++
 7 files changed, 436 insertions(+), 135 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vec_pack_unpack_128.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vec_pack_unpack_256.c

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index 071a5cb1733..d9e6043c029 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -525,17 +525,7 @@ (define_expand "vec_unpacks_hi_"
(match_operand:ILASX_WHB 1 "register_operand")]
   "ISA_HAS_LASX"
 {
-  loongarch_expand_vec_unpack (operands, false/*unsigned_p*/,
-  true/*high_p*/);
-  DONE;
-})
-
-(define_expand "vec_unpacks_lo_"
-  [(match_operand: 0 "register_operand")
-   (match_operand:ILASX_WHB 1 "register_operand")]
-  "ISA_HAS_LASX"
-{
-  loongarch_expand_vec_unpack (operands, false/*unsigned_p*/, false/*high_p*/);
+  loongarch_expand_vec_unpack (operands, false/*unsigned_p*/);
   DONE;
 })
 
@@ -544,16 +534,7 @@ (define_exp

[PATCH] LoongArch: Adjust insn patterns for better combine

2024-12-29 Thread Guo Jie
For some instruction patterns with commutative operands,
the order of operands needs to be adjusted to match the rules.

gcc/ChangeLog:

* config/loongarch/loongarch.md
(bytepick_d__rev): New combiner.
(bstrpick_alsl_paired): Reorder input operands.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/bstrpick_alsl_paired.c: New test.
* gcc.target/loongarch/bytepick_combine.c: New test.

---
 gcc/config/loongarch/loongarch.md | 23 ++-
 .../loongarch/bstrpick_alsl_paired.c  | 21 +
 .../gcc.target/loongarch/bytepick_combine.c   | 11 +
 3 files changed, 49 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/bstrpick_alsl_paired.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/bytepick_combine.c

diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index 7a110ca9de6..1c294d8088a 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -3111,13 +3111,14 @@ (define_insn "zero_extend_ashift"
 
 (define_insn "bstrpick_alsl_paired"
   [(set (match_operand:DI 0 "register_operand" "=&r")
-   (plus:DI (match_operand:DI 1 "register_operand" "r")
-(and:DI (ashift:DI (match_operand:DI 2 "register_operand" "r")
-   (match_operand 3 "const_immalsl_operand" ""))
-(match_operand 4 "immediate_operand" ""]
+   (plus:DI
+ (and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
+(match_operand 2 "const_immalsl_operand" ""))
+ (match_operand 3 "immediate_operand" ""))
+ (match_operand:DI 4 "register_operand" "r")))]
   "TARGET_64BIT
-   && ((INTVAL (operands[4]) >> INTVAL (operands[3])) == 0xffffffff)"
-  "bstrpick.d\t%0,%2,31,0\n\talsl.d\t%0,%0,%1,%3"
+   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
+  "bstrpick.d\t%0,%1,31,0\n\talsl.d\t%0,%0,%4,%2"
   [(set_attr "type" "arith")
(set_attr "mode" "DI")
(set_attr "insn_count" "2")])
@@ -4221,6 +4222,16 @@ (define_insn "bytepick_d_"
   "bytepick.d\t%0,%1,%2,"
   [(set_attr "mode" "DI")])
 
+(define_insn "bytepick_d__rev"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (ior:DI (ashift (match_operand:DI 1 "register_operand" "r")
+   (const_int bytepick_d_ashift_amount))
+   (lshiftrt (match_operand:DI 2 "register_operand" "r")
+ (const_int ]
+  "TARGET_64BIT"
+  "bytepick.d\t%0,%2,%1,"
+  [(set_attr "mode" "DI")])
+
 (define_insn "bitrev_4b"
   [(set (match_operand:SI 0 "register_operand" "=r")
(unspec:SI [(match_operand:SI 1 "register_operand" "r")]
diff --git a/gcc/testsuite/gcc.target/loongarch/bstrpick_alsl_paired.c b/gcc/testsuite/gcc.target/loongarch/bstrpick_alsl_paired.c
new file mode 100644
index 000..0bca3886c32
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/bstrpick_alsl_paired.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-mabi=lp64d -O2 -fdump-rtl-combine" } */
+/* { dg-final { scan-rtl-dump "{bstrpick_alsl_paired}" "combine" } } */
+/* { dg-final { scan-assembler-not "alsl.d\t\\\$r\[0-9\]+,\\\$r\[0-9\]+,\\\$r0" } } */
+
+struct SA
+{
+  const char *a;
+  unsigned int b : 16;
+  unsigned int c : 16;
+};
+
+extern struct SA SAs[];
+
+void
+test ()
+{
+  unsigned int i;
+  for (i = 0; i < 100; i++)
+SAs[i].c = i;
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/bytepick_combine.c b/gcc/testsuite/gcc.target/loongarch/bytepick_combine.c
new file mode 100644
index 000..2a880829ca5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/bytepick_combine.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not "slli\\.d" } } */
+/* { dg-final { scan-assembler-not "srli\\.d" } } */
+/* { dg-final { scan-assembler-times "bytepick\\.d" 1 } } */
+
+unsigned long
+bytepick_d_n (unsigned long a, unsigned long b)
+{
+  return a >> 56 | b << 8;
+}
-- 
2.20.1



[PATCH] LoongArch: Optimize for conditional move operations

2024-12-29 Thread Guo Jie
The optimization example is as follows.

From:
  if (condition)
dest += 1 << 16;
To:
  dest += (condition ? 1 : 0) << 16;

It does not use maskeqz and masknez, thus reducing the number of
instructions.
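
A small C sketch of the two forms shows that they compute the same
value. The function names are hypothetical and exist only for this
illustration. The last helper is the same positive power-of-two test
that the patch applies to the constant before it enables the
optimization.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: the addition is guarded by a branch.  */
uint64_t
add_with_branch (uint64_t dest, int condition)
{
  if (condition)
    dest += (uint64_t) 1 << 16;
  return dest;
}

/* Rewritten form: turn the condition into 0 or 1, shift, then add.  */
uint64_t
add_branch_free (uint64_t dest, int condition)
{
  dest += (uint64_t) (condition ? 1 : 0) << 16;
  return dest;
}

/* The constant must be a positive integer power of 2 for the rewrite.  */
int
is_positive_power_of_2 (int64_t val)
{
  return val > 0 && !(val & (val - 1));
}

/* Usage sketch.  */
void
demo_conditional_add (void)
{
  assert (add_with_branch (5, 1) == add_branch_free (5, 1));
  assert (add_with_branch (5, 0) == add_branch_free (5, 0));
  assert (is_positive_power_of_2 (1 << 16));
  assert (!is_positive_power_of_2 (3 << 16));
}
```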

gcc/ChangeLog:

* config/loongarch/loongarch.cc
(loongarch_expand_conditional_move): Add some optimization
implementations based on noce_try_cmove_arith.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/conditional-move-opt-1.c: New test.
* gcc.target/loongarch/conditional-move-opt-2.c: New test.

---
 gcc/config/loongarch/loongarch.cc | 103 +-
 .../loongarch/conditional-move-opt-1.c|  58 ++
 .../loongarch/conditional-move-opt-2.c|  42 +++
 3 files changed, 202 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/conditional-move-opt-1.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/conditional-move-opt-2.c

diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 2d4290bc2d1..32fd1697813 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -5294,6 +5294,81 @@ loongarch_expand_conditional_move (rtx *operands)
 loongarch_emit_float_compare (&code, &op0, &op1);
   else
 {
+  /* Optimize to reduce the number of instructions for ternary operations.
+Mainly implemented based on noce_try_cmove_arith.
+For dest = (condition) ? value_if_true : value_if_false;
+the optimization requires:
+ a. value_if_false = var;
+ b. value_if_true = var OP C (a positive integer power of 2).
+
+Situations similar to the following:
+   if (condition)
+ dest += 1 << imm;
+to:
+   dest += (condition ? 1 : 0) << imm;  */
+
+  rtx_insn *insn;
+  HOST_WIDE_INT val = 0; /* The value of rtx C.  */
+  /* INSN with operands[2] as the output.  */
+  rtx_insn *value_if_true_insn = NULL;
+  /* INSN with operands[3] as the output.  */
+  rtx_insn *value_if_false_insn = NULL;
+  rtx value_if_true_insn_src = NULL_RTX;
+  /* Common operand var in value_if_true and value_if_false.  */
+  rtx comm_var = NULL_RTX;
+  bool can_be_optimized = false;
+
+  /* Search value_if_true_insn and value_if_false_insn.  */
+  struct sequence_stack *seq = get_current_sequence ()->next;
+  for (insn = seq->last; insn; insn = PREV_INSN (insn))
+   {
+ if (single_set (insn))
+   {
+ rtx set_dest = SET_DEST (single_set (insn));
+ if (rtx_equal_p (set_dest, operands[2]))
+   value_if_true_insn = insn;
+ else if (rtx_equal_p (set_dest, operands[3]))
+   value_if_false_insn = insn;
+ if (value_if_true_insn && value_if_false_insn)
+   break;
+   }
+   }
+
+  /* Check if the optimization conditions are met.  */
+  if (value_if_true_insn
+ && value_if_false_insn
+ /* Make sure that value_if_false and var are the same.  */
+ && BINARY_P (value_if_true_insn_src
+  = SET_SRC (single_set (value_if_true_insn)))
+ /* Make sure that both value_if_true and value_if_false
+has the same var.  */
+ && rtx_equal_p (XEXP (value_if_true_insn_src, 0),
+ SET_SRC (single_set (value_if_false_insn
+   {
+ comm_var = SET_SRC (single_set (value_if_false_insn));
+ rtx src = XEXP (value_if_true_insn_src, 1);
+ rtx imm = NULL_RTX;
+ if (CONST_INT_P (src))
+   imm = src;
+ else
+   for (insn = seq->last; insn; insn = PREV_INSN (insn))
+ {
+   rtx set = single_set (insn);
+   if (set && rtx_equal_p (SET_DEST (set), src))
+ {
+   imm = SET_SRC (set);
+   break;
+ }
+ }
+ if (imm && CONST_INT_P (imm))
+   {
+ val = INTVAL (imm);
+ /* Make sure that imm is a positive integer power of 2.  */
+ if (val > 0 && !(val & (val - 1)))
+   can_be_optimized = true;
+   }
+   }
+
   if (GET_MODE_SIZE (GET_MODE (op0)) < UNITS_PER_WORD)
{
  promote_op[0] = (REG_P (op0) && REG_P (operands[2]) &&
@@ -5314,22 +5389,48 @@ loongarch_expand_conditional_move (rtx *operands)
   op0_extend = op0;
   op1_extend = force_reg (word_mode, op1);
 
+  rtx target = gen_reg_rtx (GET_MODE (op0));
+
   if (code == EQ || code == NE)
{
  op0 = loongarch_zero_if_equal (op0, op1);
  op1 = const0_rtx;
+ /* For EQ, set target to 1 if op0 and op1 are the same,
+otherwise set to 0.
+For NE, set target to 0 if op0 and op1 are the same,
+otherwise set to 1.  */
+ if (can_be_optimized)
+   loongarch_emit_binary (co

[PATCH v2] LoongArch: Add standard patterns uabd and sabd

2024-12-29 Thread Guo Jie
gcc/ChangeLog:

* config/loongarch/lasx.md (lasx_xvabsd_s_): Remove.
(abd3): New insn pattern.
(lasx_xvabsd_u_): Remove.
* config/loongarch/loongarch-builtins.cc (CODE_FOR_lsx_vabsd_b):
Rename.
(CODE_FOR_lsx_vabsd_h): Ditto.
(CODE_FOR_lsx_vabsd_w): Ditto.
(CODE_FOR_lsx_vabsd_d): Ditto.
(CODE_FOR_lsx_vabsd_bu): Ditto.
(CODE_FOR_lsx_vabsd_hu): Ditto.
(CODE_FOR_lsx_vabsd_wu): Ditto.
(CODE_FOR_lsx_vabsd_du): Ditto.
(CODE_FOR_lasx_xvabsd_b): Ditto.
(CODE_FOR_lasx_xvabsd_h): Ditto.
(CODE_FOR_lasx_xvabsd_w): Ditto.
(CODE_FOR_lasx_xvabsd_d): Ditto.
(CODE_FOR_lasx_xvabsd_bu): Ditto.
(CODE_FOR_lasx_xvabsd_hu): Ditto.
(CODE_FOR_lasx_xvabsd_wu): Ditto.
(CODE_FOR_lasx_xvabsd_du): Ditto.
* config/loongarch/loongarch.md (u): Add smax/umax.
* config/loongarch/lsx.md (SU_MAX): New iterator.
(su_min): New attr.
(lsx_vabsd_s_): Remove.
(abd3): New insn pattern.
(lsx_vabsd_u_): Remove.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/abd-lasx.c: New test.
* gcc.target/loongarch/abd-lsx.c: New test.
---
 gcc/config/loongarch/lasx.md  | 30 +++--
 gcc/config/loongarch/loongarch-builtins.cc| 32 -
 gcc/config/loongarch/loongarch.md |  6 +-
 gcc/config/loongarch/lsx.md   | 35 +-
 gcc/testsuite/gcc.target/loongarch/abd-lasx.c | 67 +++
 gcc/testsuite/gcc.target/loongarch/abd-lsx.c  | 67 +++
 6 files changed, 181 insertions(+), 56 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/abd-lasx.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/abd-lsx.c

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index 071a5cb1733..bac4e3b9435 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -20,8 +20,6 @@
 ;;
 
 (define_c_enum "unspec" [
-  UNSPEC_LASX_XVABSD_S
-  UNSPEC_LASX_XVABSD_U
   UNSPEC_LASX_XVAVG_S
   UNSPEC_LASX_XVAVG_U
   UNSPEC_LASX_XVAVGR_S
@@ -1206,23 +1204,17 @@ (define_insn "usadd3"
   [(set_attr "type" "simd_int_arith")
(set_attr "mode" "")])
 
-(define_insn "lasx_xvabsd_s_"
+(define_insn "abd3"
   [(set (match_operand:ILASX 0 "register_operand" "=f")
-   (unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
-  (match_operand:ILASX 2 "register_operand" "f")]
- UNSPEC_LASX_XVABSD_S))]
-  "ISA_HAS_LASX"
-  "xvabsd.\t%u0,%u1,%u2"
-  [(set_attr "type" "simd_int_arith")
-   (set_attr "mode" "")])
-
-(define_insn "lasx_xvabsd_u_"
-  [(set (match_operand:ILASX 0 "register_operand" "=f")
-   (unspec:ILASX [(match_operand:ILASX 1 "register_operand" "f")
-  (match_operand:ILASX 2 "register_operand" "f")]
- UNSPEC_LASX_XVABSD_U))]
+   (minus:ILASX
+ (SU_MAX:ILASX
+   (match_operand:ILASX 1 "register_operand" "f")
+   (match_operand:ILASX 2 "register_operand" "f"))
+ (<su_min>:ILASX
+   (match_dup 1)
+   (match_dup 2))))]
   "ISA_HAS_LASX"
-  "xvabsd.\t%u0,%u1,%u2"
+  "xvabsd.\t%u0,%u1,%u2"
   [(set_attr "type" "simd_int_arith")
(set_attr "mode" "")])
 
@@ -4904,7 +4896,7 @@ (define_expand "usadv32qi"
   rtx t1 = gen_reg_rtx (V32QImode);
   rtx t2 = gen_reg_rtx (V16HImode);
   rtx t3 = gen_reg_rtx (V8SImode);
-  emit_insn (gen_lasx_xvabsd_u_bu (t1, operands[1], operands[2]));
+  emit_insn (gen_uabdv32qi3 (t1, operands[1], operands[2]));
   emit_insn (gen_lasx_xvhaddw_hu_bu (t2, t1, t1));
   emit_insn (gen_lasx_xvhaddw_wu_hu (t3, t2, t2));
   emit_insn (gen_addv8si3 (operands[0], t3, operands[3]));
@@ -4921,7 +4913,7 @@ (define_expand "ssadv32qi"
   rtx t1 = gen_reg_rtx (V32QImode);
   rtx t2 = gen_reg_rtx (V16HImode);
   rtx t3 = gen_reg_rtx (V8SImode);
-  emit_insn (gen_lasx_xvabsd_s_b (t1, operands[1], operands[2]));
+  emit_insn (gen_sabdv32qi3 (t1, operands[1], operands[2]));
   emit_insn (gen_lasx_xvhaddw_hu_bu (t2, t1, t1));
   emit_insn (gen_lasx_xvhaddw_wu_hu (t3, t2, t2));
   emit_insn (gen_addv8si3 (operands[0], t3, operands[3]));
diff --git a/gcc/config/loongarch/loongarch-builtins.cc b/gcc/config/loongarch/loongarch-builtins.cc
index 261c5eb5546..75313ae2c9b 100644
--- a/gcc/config/loongarch/loongarch-builtins.cc
+++ b/gcc/config/loongarch/loongarch-builtins.cc
@@ -448,14 +448,14 @@ AVAIL_ALL (lasx_frecipe, ISA_HAS_LASX && ISA_HAS_FRECIPE)
 #define CODE_FOR_lsx_vssub_hu CODE_FOR_lsx_vssub_u_hu
 #define CODE_FOR_lsx_vssub_wu CODE_FOR_lsx_vssub_u_wu
 #define CODE_FOR_lsx_vssub_du CODE_FOR_lsx_vssub_u_du
-#define CODE_FOR_lsx_vabsd_b CODE_FOR_lsx_vabsd_s_b
-#define CODE_FOR_lsx_vabsd_h CODE_FOR_lsx_vabsd_s_h
-#define CODE_FOR_lsx_vabsd_w CODE_FOR_lsx_vabsd_s_w
-#define CODE_FOR_lsx_vabsd_d CODE_FOR_lsx_vabsd_s_d
-#define CODE_FOR_lsx_vabsd_bu CODE_FOR_lsx_vabsd_u_bu
-#define CODE_FOR_lsx_vabsd

Re: [PATCH] LoongArch: Add standard patterns uabd and sabd

2024-12-29 Thread Guo Jie

Thank you. I will fix it in patch v2.

On 2024/12/30 at 11:55 AM, Xi Ruoyao wrote:

On Mon, 2024-12-30 at 10:38 +0800, Guo Jie wrote:

diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index 7a110ca9de6..d4287012b3c 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -527,13 +527,15 @@ (define_code_attr u [(sign_extend "") (zero_extend "u")
     (gt "") (gtu "u")
     (ge "") (geu "u")
     (lt "") (ltu "u")
-    (le "") (leu "u")])
+    (le "") (leu "u")
+ (smax "") (umax "u")])

Inconsistent indent.
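
For readers following the thread: the RTL in the new pattern is a minus of SU_MAX and the matching min over the same two operands, which is an element-wise absolute difference. Below is a minimal scalar sketch in C of that arithmetic for 8-bit elements. The helper names uabd_u8 and sabd_s8 are hypothetical and do not appear in the patch; the signed helper returns the raw 8-bit result, on the assumption that the subtraction wraps modulo 2^8 as a QImode minus would.

```c
#include <stdint.h>

/* Hypothetical helper, not from the patch: unsigned absolute
   difference written as umax (a, b) - umin (a, b).  */
static uint8_t
uabd_u8 (uint8_t a, uint8_t b)
{
  uint8_t hi = a > b ? a : b;   /* umax */
  uint8_t lo = a > b ? b : a;   /* umin */
  return (uint8_t) (hi - lo);
}

/* Hypothetical helper, not from the patch: signed absolute
   difference written as smax (a, b) - smin (a, b).  The result is
   returned as the raw 8-bit pattern because the true difference can
   be as large as 255 and does not fit in int8_t.  */
static uint8_t
sabd_s8 (int8_t a, int8_t b)
{
  int8_t hi = a > b ? a : b;    /* smax */
  int8_t lo = a > b ? b : a;    /* smin */
  return (uint8_t) ((uint8_t) hi - (uint8_t) lo);
}
```

Writing the operation as max minus min rather than abs (a - b) avoids needing a wider intermediate type, which is presumably why the pattern can be expressed entirely in the vector element mode.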





Re: [PATCH] LoongArch: Optimize for conditional move operations

2024-12-29 Thread Guo Jie

Thanks for your suggestion!

Indeed, there are still some scenarios that can be optimized further in
follow-up patches.


On 2024/12/30 at 12:06 PM, Xi Ruoyao wrote:

On Mon, 2024-12-30 at 10:39 +0800, Guo Jie wrote:

+     /* Make sure that imm is a positive integer power of 2.  */

Maybe we should also consider the case $imm = 2^k + 1$ as they can be
implemented with sl[te] and bstrins.[wd].  But it can be done in another
patch anyway.
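
To make the two immediate classes under discussion concrete, here is a minimal sketch in C. It covers only the arithmetic tests, not the instruction selection. is_pow2 mirrors the condition used in this hunk, while is_pow2_plus_one is a hypothetical helper (the name and the k >= 1 restriction are assumptions, not part of the patch) for the suggested 2^k + 1 case.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if VAL is a positive power of two.  A power of two has a
   single bit set, so clearing its lowest set bit with
   VAL & (VAL - 1) leaves zero.  */
static bool
is_pow2 (int64_t val)
{
  return val > 0 && !(val & (val - 1));
}

/* Hypothetical helper: true if VAL == 2^k + 1 for some k >= 1.
   VAL == 2 (k == 0) is excluded here because it is already a power
   of two and handled by is_pow2.  */
static bool
is_pow2_plus_one (int64_t val)
{
  return val > 2 && is_pow2 (val - 1);
}
```

The guards val > 0 and val > 2 are evaluated first, so val - 1 is never computed for INT64_MIN and the subtraction cannot overflow.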


+     if (val > 0 && !(val & (val - 1)))
+   can_be_optimized = true;