[llvm-branch-commits] [llvm] release/18.x: [X86] Add missing subvector_subreg_lowering for BF16 (#83720) (PR #83758)

2024-03-04 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang closed 
https://github.com/llvm/llvm-project/pull/83758
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86] Add missing subvector_subreg_lowering for BF16 (#83720) (PR #83758)

2024-03-04 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

Some change related on trunk code, I'll create a manual cherry-pick.

https://github.com/llvm/llvm-project/pull/83758
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [X86] Add missing subvector_subreg_lowering for BF16 (#83720) (PR #83811)

2024-03-04 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang created 
https://github.com/llvm/llvm-project/pull/83811

None

>From 7210a98061d9f6382e8ce4ca7646ec4a5c3f96e6 Mon Sep 17 00:00:00 2001
From: Phoebe Wang 
Date: Mon, 4 Mar 2024 17:28:23 +0800
Subject: [PATCH] [X86] Add missing subvector_subreg_lowering for BF16 (#83720)

---
 llvm/lib/Target/X86/X86InstrVecCompiler.td|  3 +++
 .../CodeGen/X86/avx512bf16-vl-intrinsics.ll   | 22 +++
 2 files changed, 25 insertions(+)

diff --git a/llvm/lib/Target/X86/X86InstrVecCompiler.td 
b/llvm/lib/Target/X86/X86InstrVecCompiler.td
index 70bd77bba03ab3..00da13407bde68 100644
--- a/llvm/lib/Target/X86/X86InstrVecCompiler.td
+++ b/llvm/lib/Target/X86/X86InstrVecCompiler.td
@@ -83,6 +83,7 @@ defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
+defm : subvector_subreg_lowering;
 
 // A 128-bit subvector extract from the first 512-bit vector position is a
 // subregister copy that needs no instruction. Likewise, a 128-bit subvector
@@ -95,6 +96,7 @@ defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
+defm : subvector_subreg_lowering;
 
 // A 128-bit subvector extract from the first 512-bit vector position is a
 // subregister copy that needs no instruction. Likewise, a 128-bit subvector
@@ -107,6 +109,7 @@ defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
+defm : subvector_subreg_lowering;
 
 
 // If we're inserting into an all zeros vector, just use a plain move which
diff --git a/llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll 
b/llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll
index 40b512d68be816..6b07d69d2b0b1c 100644
--- a/llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll
+++ b/llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll
@@ -402,3 +402,25 @@ entry:
   %1 = shufflevector <8 x bfloat> %0, <8 x bfloat> undef, <16 x i32> 
zeroinitializer
   ret <16 x bfloat> %1
 }
+
+define <16 x i32> @pr83358() {
+; X86-LABEL: pr83358:
+; X86:   # %bb.0:
+; X86-NEXT:vcvtneps2bf16y {{\.?LCPI[0-9]+_[0-9]+}}, %xmm0 # encoding: 
[0x62,0xf2,0x7e,0x28,0x72,0x05,A,A,A,A]
+; X86-NEXT:# fixup A - offset: 6, value: {{\.?LCPI[0-9]+_[0-9]+}}, kind: 
FK_Data_4
+; X86-NEXT:vinsertf128 $1, %xmm0, %ymm0, %ymm0 # EVEX TO VEX Compression 
encoding: [0xc4,0xe3,0x7d,0x18,0xc0,0x01]
+; X86-NEXT:vinsertf64x4 $1, %ymm0, %zmm0, %zmm0 # encoding: 
[0x62,0xf3,0xfd,0x48,0x1a,0xc0,0x01]
+; X86-NEXT:retl # encoding: [0xc3]
+;
+; X64-LABEL: pr83358:
+; X64:   # %bb.0:
+; X64-NEXT:vcvtneps2bf16y {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0 # 
encoding: [0x62,0xf2,0x7e,0x28,0x72,0x05,A,A,A,A]
+; X64-NEXT:# fixup A - offset: 6, value: {{\.?LCPI[0-9]+_[0-9]+}}-4, kind: 
reloc_riprel_4byte
+; X64-NEXT:vinsertf128 $1, %xmm0, %ymm0, %ymm0 # EVEX TO VEX Compression 
encoding: [0xc4,0xe3,0x7d,0x18,0xc0,0x01]
+; X64-NEXT:vinsertf64x4 $1, %ymm0, %zmm0, %zmm0 # encoding: 
[0x62,0xf3,0xfd,0x48,0x1a,0xc0,0x01]
+; X64-NEXT:retq # encoding: [0xc3]
+  %1 = call <8 x bfloat> @llvm.x86.avx512bf16.cvtneps2bf16.256(<8 x float> 
)
+  %2 = bitcast <8 x bfloat> %1 to <4 x i32>
+  %3 = shufflevector <4 x i32> %2, <4 x i32> undef, <16 x i32> 
+  ret <16 x i32> %3
+}

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [X86] Add missing subvector_subreg_lowering for BF16 (#83720) (PR #83834)

2024-03-04 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang created 
https://github.com/llvm/llvm-project/pull/83834

None

>From 6d03789303fe3d5c84406df29353ddf63199eb08 Mon Sep 17 00:00:00 2001
From: Phoebe Wang 
Date: Mon, 4 Mar 2024 18:09:41 +0800
Subject: [PATCH] [X86] Add missing subvector_subreg_lowering for BF16 (#83720)

---
 llvm/lib/Target/X86/X86InstrVecCompiler.td|  3 +++
 .../CodeGen/X86/avx512bf16-vl-intrinsics.ll   | 22 +++
 llvm/test/CodeGen/X86/bfloat.ll   |  1 -
 3 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Target/X86/X86InstrVecCompiler.td 
b/llvm/lib/Target/X86/X86InstrVecCompiler.td
index bbd19cf8d5b25e..461b2badc13134 100644
--- a/llvm/lib/Target/X86/X86InstrVecCompiler.td
+++ b/llvm/lib/Target/X86/X86InstrVecCompiler.td
@@ -83,6 +83,7 @@ defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
+defm : subvector_subreg_lowering;
 
 // A 128-bit subvector extract from the first 512-bit vector position is a
 // subregister copy that needs no instruction. Likewise, a 128-bit subvector
@@ -95,6 +96,7 @@ defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
+defm : subvector_subreg_lowering;
 
 // A 128-bit subvector extract from the first 512-bit vector position is a
 // subregister copy that needs no instruction. Likewise, a 128-bit subvector
@@ -107,6 +109,7 @@ defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
+defm : subvector_subreg_lowering;
 
 
 // If we're inserting into an all zeros vector, just use a plain move which
diff --git a/llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll 
b/llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll
index 0826faa1071b01..482713e12d15c7 100644
--- a/llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll
+++ b/llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll
@@ -381,3 +381,25 @@ entry:
   %1 = shufflevector <8 x bfloat> %0, <8 x bfloat> undef, <16 x i32> 
zeroinitializer
   ret <16 x bfloat> %1
 }
+
+define <16 x i32> @pr83358() {
+; X86-LABEL: pr83358:
+; X86:   # %bb.0:
+; X86-NEXT:vcvtneps2bf16y {{\.?LCPI[0-9]+_[0-9]+}}, %xmm0 # encoding: 
[0x62,0xf2,0x7e,0x28,0x72,0x05,A,A,A,A]
+; X86-NEXT:# fixup A - offset: 6, value: {{\.?LCPI[0-9]+_[0-9]+}}, kind: 
FK_Data_4
+; X86-NEXT:vshufi64x2 $0, %zmm0, %zmm0, %zmm0 # encoding: 
[0x62,0xf3,0xfd,0x48,0x43,0xc0,0x00]
+; X86-NEXT:# zmm0 = zmm0[0,1,0,1,0,1,0,1]
+; X86-NEXT:retl # encoding: [0xc3]
+;
+; X64-LABEL: pr83358:
+; X64:   # %bb.0:
+; X64-NEXT:vcvtneps2bf16y {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0 # 
encoding: [0x62,0xf2,0x7e,0x28,0x72,0x05,A,A,A,A]
+; X64-NEXT:# fixup A - offset: 6, value: {{\.?LCPI[0-9]+_[0-9]+}}-4, kind: 
reloc_riprel_4byte
+; X64-NEXT:vshufi64x2 $0, %zmm0, %zmm0, %zmm0 # encoding: 
[0x62,0xf3,0xfd,0x48,0x43,0xc0,0x00]
+; X64-NEXT:# zmm0 = zmm0[0,1,0,1,0,1,0,1]
+; X64-NEXT:retq # encoding: [0xc3]
+  %1 = call <8 x bfloat> @llvm.x86.avx512bf16.cvtneps2bf16.256(<8 x float> 
)
+  %2 = bitcast <8 x bfloat> %1 to <4 x i32>
+  %3 = shufflevector <4 x i32> %2, <4 x i32> undef, <16 x i32> 
+  ret <16 x i32> %3
+}
diff --git a/llvm/test/CodeGen/X86/bfloat.ll b/llvm/test/CodeGen/X86/bfloat.ll
index f2d3c4fb34199e..cd1dba17611628 100644
--- a/llvm/test/CodeGen/X86/bfloat.ll
+++ b/llvm/test/CodeGen/X86/bfloat.ll
@@ -2423,7 +2423,6 @@ define <16 x bfloat> @fptrunc_v16f32(<16 x float> %a) 
nounwind {
 ; AVXNC-LABEL: fptrunc_v16f32:
 ; AVXNC:   # %bb.0:
 ; AVXNC-NEXT:{vex} vcvtneps2bf16 %ymm0, %xmm0
-; AVXNC-NEXT:vinsertf128 $0, %xmm0, %ymm0, %ymm0
 ; AVXNC-NEXT:{vex} vcvtneps2bf16 %ymm1, %xmm1
 ; AVXNC-NEXT:vinsertf128 $1, %xmm1, %ymm0, %ymm0
 ; AVXNC-NEXT:retq

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [X86] Add missing subvector_subreg_lowering for BF16 (#83720) (PR #83834)

2024-03-04 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang milestoned 
https://github.com/llvm/llvm-project/pull/83834
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86][Inline] Skip inline asm in inlining target feature check (#83820) (PR #84029)

2024-03-06 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

Risk is minimum, LGTM.

https://github.com/llvm/llvm-project/pull/84029
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86]: Add FPCW as a rounding control register (PR #84058)

2024-03-06 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

I don't think it worth backport. The x87 usage is not very common nowadays, and 
the problem is lasting for a while.
I'd suggest not do the backport considering it may have sideeffect in the 
isCall change.

https://github.com/llvm/llvm-project/pull/84058
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86] combineAndShuffleNot - ensure the type is legal before create X86ISD::ANDNP target nodes (PR #84698)

2024-03-11 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

LGTM.

https://github.com/llvm/llvm-project/pull/84698
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [X86] Add missing subvector_subreg_lowering for BF16 (#83720) (PR #83834)

2024-03-11 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang updated 
https://github.com/llvm/llvm-project/pull/83834

>From 6d03789303fe3d5c84406df29353ddf63199eb08 Mon Sep 17 00:00:00 2001
From: Phoebe Wang 
Date: Mon, 4 Mar 2024 18:09:41 +0800
Subject: [PATCH] [X86] Add missing subvector_subreg_lowering for BF16 (#83720)

---
 llvm/lib/Target/X86/X86InstrVecCompiler.td|  3 +++
 .../CodeGen/X86/avx512bf16-vl-intrinsics.ll   | 22 +++
 llvm/test/CodeGen/X86/bfloat.ll   |  1 -
 3 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Target/X86/X86InstrVecCompiler.td 
b/llvm/lib/Target/X86/X86InstrVecCompiler.td
index bbd19cf8d5b25e..461b2badc13134 100644
--- a/llvm/lib/Target/X86/X86InstrVecCompiler.td
+++ b/llvm/lib/Target/X86/X86InstrVecCompiler.td
@@ -83,6 +83,7 @@ defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
+defm : subvector_subreg_lowering;
 
 // A 128-bit subvector extract from the first 512-bit vector position is a
 // subregister copy that needs no instruction. Likewise, a 128-bit subvector
@@ -95,6 +96,7 @@ defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
+defm : subvector_subreg_lowering;
 
 // A 128-bit subvector extract from the first 512-bit vector position is a
 // subregister copy that needs no instruction. Likewise, a 128-bit subvector
@@ -107,6 +109,7 @@ defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
+defm : subvector_subreg_lowering;
 
 
 // If we're inserting into an all zeros vector, just use a plain move which
diff --git a/llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll 
b/llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll
index 0826faa1071b01..482713e12d15c7 100644
--- a/llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll
+++ b/llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll
@@ -381,3 +381,25 @@ entry:
   %1 = shufflevector <8 x bfloat> %0, <8 x bfloat> undef, <16 x i32> 
zeroinitializer
   ret <16 x bfloat> %1
 }
+
+define <16 x i32> @pr83358() {
+; X86-LABEL: pr83358:
+; X86:   # %bb.0:
+; X86-NEXT:vcvtneps2bf16y {{\.?LCPI[0-9]+_[0-9]+}}, %xmm0 # encoding: 
[0x62,0xf2,0x7e,0x28,0x72,0x05,A,A,A,A]
+; X86-NEXT:# fixup A - offset: 6, value: {{\.?LCPI[0-9]+_[0-9]+}}, kind: 
FK_Data_4
+; X86-NEXT:vshufi64x2 $0, %zmm0, %zmm0, %zmm0 # encoding: 
[0x62,0xf3,0xfd,0x48,0x43,0xc0,0x00]
+; X86-NEXT:# zmm0 = zmm0[0,1,0,1,0,1,0,1]
+; X86-NEXT:retl # encoding: [0xc3]
+;
+; X64-LABEL: pr83358:
+; X64:   # %bb.0:
+; X64-NEXT:vcvtneps2bf16y {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0 # 
encoding: [0x62,0xf2,0x7e,0x28,0x72,0x05,A,A,A,A]
+; X64-NEXT:# fixup A - offset: 6, value: {{\.?LCPI[0-9]+_[0-9]+}}-4, kind: 
reloc_riprel_4byte
+; X64-NEXT:vshufi64x2 $0, %zmm0, %zmm0, %zmm0 # encoding: 
[0x62,0xf3,0xfd,0x48,0x43,0xc0,0x00]
+; X64-NEXT:# zmm0 = zmm0[0,1,0,1,0,1,0,1]
+; X64-NEXT:retq # encoding: [0xc3]
+  %1 = call <8 x bfloat> @llvm.x86.avx512bf16.cvtneps2bf16.256(<8 x float> 
)
+  %2 = bitcast <8 x bfloat> %1 to <4 x i32>
+  %3 = shufflevector <4 x i32> %2, <4 x i32> undef, <16 x i32> 
+  ret <16 x i32> %3
+}
diff --git a/llvm/test/CodeGen/X86/bfloat.ll b/llvm/test/CodeGen/X86/bfloat.ll
index f2d3c4fb34199e..cd1dba17611628 100644
--- a/llvm/test/CodeGen/X86/bfloat.ll
+++ b/llvm/test/CodeGen/X86/bfloat.ll
@@ -2423,7 +2423,6 @@ define <16 x bfloat> @fptrunc_v16f32(<16 x float> %a) 
nounwind {
 ; AVXNC-LABEL: fptrunc_v16f32:
 ; AVXNC:   # %bb.0:
 ; AVXNC-NEXT:{vex} vcvtneps2bf16 %ymm0, %xmm0
-; AVXNC-NEXT:vinsertf128 $0, %xmm0, %ymm0, %ymm0
 ; AVXNC-NEXT:{vex} vcvtneps2bf16 %ymm1, %xmm1
 ; AVXNC-NEXT:vinsertf128 $1, %xmm1, %ymm0, %ymm0
 ; AVXNC-NEXT:retq

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86] Add missing subvector_subreg_lowering for BF16 (#83720) (PR #84491)

2024-03-11 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

Please review this patch instead https://github.com/llvm/llvm-project/pull/83834

https://github.com/llvm/llvm-project/pull/84491
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86] Add missing subvector_subreg_lowering for BF16 (#83720) (PR #83834)

2024-03-12 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang updated 
https://github.com/llvm/llvm-project/pull/83834

>From 6d03789303fe3d5c84406df29353ddf63199eb08 Mon Sep 17 00:00:00 2001
From: Phoebe Wang 
Date: Mon, 4 Mar 2024 18:09:41 +0800
Subject: [PATCH] [X86] Add missing subvector_subreg_lowering for BF16 (#83720)

---
 llvm/lib/Target/X86/X86InstrVecCompiler.td|  3 +++
 .../CodeGen/X86/avx512bf16-vl-intrinsics.ll   | 22 +++
 llvm/test/CodeGen/X86/bfloat.ll   |  1 -
 3 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Target/X86/X86InstrVecCompiler.td 
b/llvm/lib/Target/X86/X86InstrVecCompiler.td
index bbd19cf8d5b25e..461b2badc13134 100644
--- a/llvm/lib/Target/X86/X86InstrVecCompiler.td
+++ b/llvm/lib/Target/X86/X86InstrVecCompiler.td
@@ -83,6 +83,7 @@ defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
+defm : subvector_subreg_lowering;
 
 // A 128-bit subvector extract from the first 512-bit vector position is a
 // subregister copy that needs no instruction. Likewise, a 128-bit subvector
@@ -95,6 +96,7 @@ defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
+defm : subvector_subreg_lowering;
 
 // A 128-bit subvector extract from the first 512-bit vector position is a
 // subregister copy that needs no instruction. Likewise, a 128-bit subvector
@@ -107,6 +109,7 @@ defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
+defm : subvector_subreg_lowering;
 
 
 // If we're inserting into an all zeros vector, just use a plain move which
diff --git a/llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll 
b/llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll
index 0826faa1071b01..482713e12d15c7 100644
--- a/llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll
+++ b/llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll
@@ -381,3 +381,25 @@ entry:
   %1 = shufflevector <8 x bfloat> %0, <8 x bfloat> undef, <16 x i32> 
zeroinitializer
   ret <16 x bfloat> %1
 }
+
+define <16 x i32> @pr83358() {
+; X86-LABEL: pr83358:
+; X86:   # %bb.0:
+; X86-NEXT:vcvtneps2bf16y {{\.?LCPI[0-9]+_[0-9]+}}, %xmm0 # encoding: 
[0x62,0xf2,0x7e,0x28,0x72,0x05,A,A,A,A]
+; X86-NEXT:# fixup A - offset: 6, value: {{\.?LCPI[0-9]+_[0-9]+}}, kind: 
FK_Data_4
+; X86-NEXT:vshufi64x2 $0, %zmm0, %zmm0, %zmm0 # encoding: 
[0x62,0xf3,0xfd,0x48,0x43,0xc0,0x00]
+; X86-NEXT:# zmm0 = zmm0[0,1,0,1,0,1,0,1]
+; X86-NEXT:retl # encoding: [0xc3]
+;
+; X64-LABEL: pr83358:
+; X64:   # %bb.0:
+; X64-NEXT:vcvtneps2bf16y {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0 # 
encoding: [0x62,0xf2,0x7e,0x28,0x72,0x05,A,A,A,A]
+; X64-NEXT:# fixup A - offset: 6, value: {{\.?LCPI[0-9]+_[0-9]+}}-4, kind: 
reloc_riprel_4byte
+; X64-NEXT:vshufi64x2 $0, %zmm0, %zmm0, %zmm0 # encoding: 
[0x62,0xf3,0xfd,0x48,0x43,0xc0,0x00]
+; X64-NEXT:# zmm0 = zmm0[0,1,0,1,0,1,0,1]
+; X64-NEXT:retq # encoding: [0xc3]
+  %1 = call <8 x bfloat> @llvm.x86.avx512bf16.cvtneps2bf16.256(<8 x float> 
)
+  %2 = bitcast <8 x bfloat> %1 to <4 x i32>
+  %3 = shufflevector <4 x i32> %2, <4 x i32> undef, <16 x i32> 
+  ret <16 x i32> %3
+}
diff --git a/llvm/test/CodeGen/X86/bfloat.ll b/llvm/test/CodeGen/X86/bfloat.ll
index f2d3c4fb34199e..cd1dba17611628 100644
--- a/llvm/test/CodeGen/X86/bfloat.ll
+++ b/llvm/test/CodeGen/X86/bfloat.ll
@@ -2423,7 +2423,6 @@ define <16 x bfloat> @fptrunc_v16f32(<16 x float> %a) 
nounwind {
 ; AVXNC-LABEL: fptrunc_v16f32:
 ; AVXNC:   # %bb.0:
 ; AVXNC-NEXT:{vex} vcvtneps2bf16 %ymm0, %xmm0
-; AVXNC-NEXT:vinsertf128 $0, %xmm0, %ymm0, %ymm0
 ; AVXNC-NEXT:{vex} vcvtneps2bf16 %ymm1, %xmm1
 ; AVXNC-NEXT:vinsertf128 $1, %xmm1, %ymm0, %ymm0
 ; AVXNC-NEXT:retq

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86] Add missing subvector_subreg_lowering for BF16 (#83720) (PR #83834)

2024-03-12 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang updated 
https://github.com/llvm/llvm-project/pull/83834

>From 9cec3b7b1ea0491df688555a51750efe6c9bd075 Mon Sep 17 00:00:00 2001
From: Phoebe Wang 
Date: Mon, 4 Mar 2024 18:09:41 +0800
Subject: [PATCH] [X86] Add missing subvector_subreg_lowering for BF16 (#83720)

---
 llvm/lib/Target/X86/X86InstrVecCompiler.td|  3 +++
 .../CodeGen/X86/avx512bf16-vl-intrinsics.ll   | 22 +++
 llvm/test/CodeGen/X86/bfloat.ll   |  1 -
 3 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Target/X86/X86InstrVecCompiler.td 
b/llvm/lib/Target/X86/X86InstrVecCompiler.td
index bbd19cf8d5b25e..461b2badc13134 100644
--- a/llvm/lib/Target/X86/X86InstrVecCompiler.td
+++ b/llvm/lib/Target/X86/X86InstrVecCompiler.td
@@ -83,6 +83,7 @@ defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
+defm : subvector_subreg_lowering;
 
 // A 128-bit subvector extract from the first 512-bit vector position is a
 // subregister copy that needs no instruction. Likewise, a 128-bit subvector
@@ -95,6 +96,7 @@ defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
+defm : subvector_subreg_lowering;
 
 // A 128-bit subvector extract from the first 512-bit vector position is a
 // subregister copy that needs no instruction. Likewise, a 128-bit subvector
@@ -107,6 +109,7 @@ defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
 defm : subvector_subreg_lowering;
+defm : subvector_subreg_lowering;
 
 
 // If we're inserting into an all zeros vector, just use a plain move which
diff --git a/llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll 
b/llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll
index 0826faa1071b01..482713e12d15c7 100644
--- a/llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll
+++ b/llvm/test/CodeGen/X86/avx512bf16-vl-intrinsics.ll
@@ -381,3 +381,25 @@ entry:
   %1 = shufflevector <8 x bfloat> %0, <8 x bfloat> undef, <16 x i32> 
zeroinitializer
   ret <16 x bfloat> %1
 }
+
+define <16 x i32> @pr83358() {
+; X86-LABEL: pr83358:
+; X86:   # %bb.0:
+; X86-NEXT:vcvtneps2bf16y {{\.?LCPI[0-9]+_[0-9]+}}, %xmm0 # encoding: 
[0x62,0xf2,0x7e,0x28,0x72,0x05,A,A,A,A]
+; X86-NEXT:# fixup A - offset: 6, value: {{\.?LCPI[0-9]+_[0-9]+}}, kind: 
FK_Data_4
+; X86-NEXT:vshufi64x2 $0, %zmm0, %zmm0, %zmm0 # encoding: 
[0x62,0xf3,0xfd,0x48,0x43,0xc0,0x00]
+; X86-NEXT:# zmm0 = zmm0[0,1,0,1,0,1,0,1]
+; X86-NEXT:retl # encoding: [0xc3]
+;
+; X64-LABEL: pr83358:
+; X64:   # %bb.0:
+; X64-NEXT:vcvtneps2bf16y {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0 # 
encoding: [0x62,0xf2,0x7e,0x28,0x72,0x05,A,A,A,A]
+; X64-NEXT:# fixup A - offset: 6, value: {{\.?LCPI[0-9]+_[0-9]+}}-4, kind: 
reloc_riprel_4byte
+; X64-NEXT:vshufi64x2 $0, %zmm0, %zmm0, %zmm0 # encoding: 
[0x62,0xf3,0xfd,0x48,0x43,0xc0,0x00]
+; X64-NEXT:# zmm0 = zmm0[0,1,0,1,0,1,0,1]
+; X64-NEXT:retq # encoding: [0xc3]
+  %1 = call <8 x bfloat> @llvm.x86.avx512bf16.cvtneps2bf16.256(<8 x float> 
)
+  %2 = bitcast <8 x bfloat> %1 to <4 x i32>
+  %3 = shufflevector <4 x i32> %2, <4 x i32> undef, <16 x i32> 
+  ret <16 x i32> %3
+}
diff --git a/llvm/test/CodeGen/X86/bfloat.ll b/llvm/test/CodeGen/X86/bfloat.ll
index f2d3c4fb34199e..cd1dba17611628 100644
--- a/llvm/test/CodeGen/X86/bfloat.ll
+++ b/llvm/test/CodeGen/X86/bfloat.ll
@@ -2423,7 +2423,6 @@ define <16 x bfloat> @fptrunc_v16f32(<16 x float> %a) 
nounwind {
 ; AVXNC-LABEL: fptrunc_v16f32:
 ; AVXNC:   # %bb.0:
 ; AVXNC-NEXT:{vex} vcvtneps2bf16 %ymm0, %xmm0
-; AVXNC-NEXT:vinsertf128 $0, %xmm0, %ymm0, %ymm0
 ; AVXNC-NEXT:{vex} vcvtneps2bf16 %ymm1, %xmm1
 ; AVXNC-NEXT:vinsertf128 $1, %xmm1, %ymm0, %ymm0
 ; AVXNC-NEXT:retq

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [Codegen][X86] Fix /HOTPATCH with clang-cl and inline asm (#87639) (PR #88388)

2024-04-11 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang approved this pull request.

LGTM.

https://github.com/llvm/llvm-project/pull/88388
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86] Fix miscompile in combineShiftRightArithmetic (PR #86728)

2024-04-16 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

> @phoebewang What do you think about backporting this?

I didn't review on it. Maybe @topperc can evaluate it.

https://github.com/llvm/llvm-project/pull/86728
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/18.x [X86_64] fix SSE type error in vaarg (PR #86698)

2024-04-16 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

> @phoebewang What do you think about backporting this?

I think the patch doesn't fix all problem in #86371, suggest to reevaluate it. 
@efriedma-quic may take another look.

https://github.com/llvm/llvm-project/pull/86698
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86][EVEX512] Check hasEVEX512 for canExtendTo512DQ (#90390) (PR #90422)

2024-05-02 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

> Hi @phoebewang (or anyone else). If you would like to add a note about this 
> fix in the release notes (completely optional). Please reply to this comment 
> with a one or two sentence description of the fix. When you are done, please 
> add the release:note label to this PR.

This patch fixes a X86 bug introduced during LLVM18, which crashes when 
compiling some bit vector with AVX512.

https://github.com/llvm/llvm-project/pull/90422
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125) (PR #91425)

2024-05-07 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang milestoned 
https://github.com/llvm/llvm-project/pull/91425
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125) (PR #91425)

2024-05-07 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang created 
https://github.com/llvm/llvm-project/pull/91425

None

>From 5f3651376c051c1fb7b29741778a4616811a1157 Mon Sep 17 00:00:00 2001
From: Phoebe Wang 
Date: Mon, 6 May 2024 10:59:44 +0800
Subject: [PATCH 1/2] [X86][FP16] Do not create VBROADCAST_LOAD for f16 without
 AVX2 (#91125)

AVX doesn't provide 16-bit BROADCAST instruction.

Fixes #91005
---
 llvm/lib/Target/X86/X86ISelLowering.cpp |  2 +-
 llvm/test/CodeGen/X86/pr91005.ll| 39 +
 2 files changed, 40 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/CodeGen/X86/pr91005.ll

diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp 
b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 71fc6b5047eaa..2752f8a92447c 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -7295,7 +7295,7 @@ static SDValue 
lowerBuildVectorAsBroadcast(BuildVectorSDNode *BVOp,
 // With pattern matching, the VBROADCAST node may become a VMOVDDUP.
 if (ScalarSize == 32 ||
 (ScalarSize == 64 && (IsGE256 || Subtarget.hasVLX())) ||
-CVT == MVT::f16 ||
+(CVT == MVT::f16 && Subtarget.hasAVX2()) ||
 (OptForSize && (ScalarSize == 64 || Subtarget.hasAVX2( {
   const Constant *C = nullptr;
   if (ConstantSDNode *CI = dyn_cast(Ld))
diff --git a/llvm/test/CodeGen/X86/pr91005.ll b/llvm/test/CodeGen/X86/pr91005.ll
new file mode 100644
index 0..97fd1ce456882
--- /dev/null
+++ b/llvm/test/CodeGen/X86/pr91005.ll
@@ -0,0 +1,39 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py 
UTC_ARGS: --version 4
+; RUN: llc -mtriple=x86_64-unknown-unknown -mattr=+f16c < %s | FileCheck %s
+
+define void @PR91005(ptr %0) minsize {
+; CHECK-LABEL: PR91005:
+; CHECK:   # %bb.0:
+; CHECK-NEXT:xorl %eax, %eax
+; CHECK-NEXT:testb %al, %al
+; CHECK-NEXT:je .LBB0_2
+; CHECK-NEXT:  # %bb.1:
+; CHECK-NEXT:vbroadcastss {{.*#+}} xmm0 = [31744,31744,31744,31744]
+; CHECK-NEXT:vpcmpeqw %xmm0, %xmm0, %xmm0
+; CHECK-NEXT:vpinsrw $0, {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm1
+; CHECK-NEXT:vpand %xmm1, %xmm0, %xmm0
+; CHECK-NEXT:vcvtph2ps %xmm0, %xmm0
+; CHECK-NEXT:vpxor %xmm1, %xmm1, %xmm1
+; CHECK-NEXT:vmulss %xmm1, %xmm0, %xmm0
+; CHECK-NEXT:vcvtps2ph $4, %xmm0, %xmm0
+; CHECK-NEXT:vmovd %xmm0, %eax
+; CHECK-NEXT:movw %ax, (%rdi)
+; CHECK-NEXT:  .LBB0_2: # %common.ret
+; CHECK-NEXT:retq
+  %2 = bitcast <2 x half> poison to <2 x i16>
+  %3 = icmp eq <2 x i16> %2, 
+  br i1 poison, label %4, label %common.ret
+
+common.ret:   ; preds = %4, %1
+  ret void
+
+4:; preds = %1
+  %5 = select <2 x i1> %3, <2 x half> , <2 x half> 
zeroinitializer
+  %6 = fmul <2 x half> %5, zeroinitializer
+  %7 = fsub <2 x half> %6, zeroinitializer
+  %8 = extractelement <2 x half> %7, i64 0
+  store half %8, ptr %0, align 2
+  br label %common.ret
+}
+
+declare <2 x half> @llvm.fabs.v2f16(<2 x half>)

>From 2f7c021c62c949d904510917220004b2be86e127 Mon Sep 17 00:00:00 2001
From: Phoebe Wang 
Date: Wed, 8 May 2024 10:59:31 +0800
Subject: [PATCH 2/2] Fix difference with LLVM 18 release

---
 llvm/test/CodeGen/X86/pr91005.ll | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/llvm/test/CodeGen/X86/pr91005.ll b/llvm/test/CodeGen/X86/pr91005.ll
index 97fd1ce456882..16b78bf1e7e17 100644
--- a/llvm/test/CodeGen/X86/pr91005.ll
+++ b/llvm/test/CodeGen/X86/pr91005.ll
@@ -8,12 +8,13 @@ define void @PR91005(ptr %0) minsize {
 ; CHECK-NEXT:testb %al, %al
 ; CHECK-NEXT:je .LBB0_2
 ; CHECK-NEXT:  # %bb.1:
-; CHECK-NEXT:vbroadcastss {{.*#+}} xmm0 = [31744,31744,31744,31744]
-; CHECK-NEXT:vpcmpeqw %xmm0, %xmm0, %xmm0
-; CHECK-NEXT:vpinsrw $0, {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm1
-; CHECK-NEXT:vpand %xmm1, %xmm0, %xmm0
+; CHECK-NEXT:vpcmpeqw {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0
+; CHECK-NEXT:vpand {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm0, %xmm0
+; CHECK-NEXT:vpextrw $0, %xmm0, %eax
+; CHECK-NEXT:movzwl %ax, %eax
+; CHECK-NEXT:vmovd %eax, %xmm0
 ; CHECK-NEXT:vcvtph2ps %xmm0, %xmm0
-; CHECK-NEXT:vpxor %xmm1, %xmm1, %xmm1
+; CHECK-NEXT:vxorps %xmm1, %xmm1, %xmm1
 ; CHECK-NEXT:vmulss %xmm1, %xmm0, %xmm0
 ; CHECK-NEXT:vcvtps2ph $4, %xmm0, %xmm0
 ; CHECK-NEXT:vmovd %xmm0, %eax

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125) (PR #91161)

2024-05-07 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang closed 
https://github.com/llvm/llvm-project/pull/91161
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125) (PR #91161)

2024-05-07 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

The test failures are caused by LLVM 18 branch difference, created #91425 
instead.

https://github.com/llvm/llvm-project/pull/91161
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86][EVEX512] Add `HasEVEX512` when `NoVLX` used for 512-bit patterns (#91106) (PR #91118)

2024-05-09 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

> @phoebewang (or anyone else). If you would like to add a note about this fix 
> in the release notes (completely optional). Please reply to this comment with 
> a one or two sentence description of the fix. When you are done, please add 
> the release:note label to this PR.

This patch fixes compiler crush when user specifies `-mno-evex512` with AVX512 
features but no AVX512VL.

https://github.com/llvm/llvm-project/pull/91118
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86][FP16] Do not create VBROADCAST_LOAD for f16 without AVX2 (#91125) (PR #91425)

2024-05-09 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

> @phoebewang (or anyone else). If you would like to add a note about this fix 
> in the release notes (completely optional). Please reply to this comment with 
> a one or two sentence description of the fix. When you are done, please add 
> the release:note label to this PR.

This patch fixes a bug that tries to do VBROADCAST_LOAD for f16 without AVX2.

https://github.com/llvm/llvm-project/pull/91425
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [X86][Driver] Do not add `-evex512` for `-march=native` when the target doesn't support AVX512 (#91694) (PR #91705)

2024-05-13 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

This patch fixes build failures when compiling AVX512 code using 
`-march=native` on machines without AVX512. The problem was introduced by 
https://github.com/llvm/llvm-project/commit/a7b8b890600a33e0c88d639f311f1d73ccb1c8d2
 which is included in LLVM 18.1.5 release.

https://github.com/llvm/llvm-project/pull/91705
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] 5237193 - [NFC] Fix typos in comments

2023-11-29 Thread Phoebe Wang via llvm-branch-commits

Author: Phoebe Wang
Date: 2023-11-19T10:14:34+08:00
New Revision: 5237193b87721134541f228e28edfd544a9c8ac8

URL: 
https://github.com/llvm/llvm-project/commit/5237193b87721134541f228e28edfd544a9c8ac8
DIFF: 
https://github.com/llvm/llvm-project/commit/5237193b87721134541f228e28edfd544a9c8ac8.diff

LOG: [NFC] Fix typos in comments

Added: 


Modified: 
clang/lib/CodeGen/CodeGenFunction.cpp

Removed: 




diff  --git a/clang/lib/CodeGen/CodeGenFunction.cpp 
b/clang/lib/CodeGen/CodeGenFunction.cpp
index 64521ce7182eee6..2199d7b58fb96e6 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -495,12 +495,12 @@ void CodeGenFunction::FinishFunction(SourceLocation 
EndLoc) {
   if (CurFnInfo->getMaxVectorWidth() > LargestVectorWidth)
 LargestVectorWidth = CurFnInfo->getMaxVectorWidth();
 
-  // Add the required-vector-width attribute. This contains the max width from:
+  // Add the min-legal-vector-width attribute. This contains the max width 
from:
   // 1. min-vector-width attribute used in the source program.
   // 2. Any builtins used that have a vector width specified.
   // 3. Values passed in and out of inline assembly.
   // 4. Width of vector arguments and return types for this function.
-  // 5. Width of vector aguments and return types for functions called by this
+  // 5. Width of vector arguments and return types for functions called by this
   //function.
   if (getContext().getTargetInfo().getTriple().isX86())
 CurFn->addFnAttr("min-legal-vector-width",



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79279 (PR #79341)

2024-01-24 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

> @phoebewang What do you think about merging this PR to the release branch?

The fix is required and has low risk, LGTM.

https://github.com/llvm/llvm-project/pull/79341
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79675 (PR #79721)

2024-01-27 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

> @phoebewang What do you think about merging this PR to the release branch?

This patch fixes prior mistakes, so should be merged to release branch. 
@KanRobert Do I understand it right?

https://github.com/llvm/llvm-project/pull/79721
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/19.x: [X86] Use correct fp immediate types in _mm_set_ss/sd (PR #105638)

2024-08-22 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang approved this pull request.

LGTM, thanks!

https://github.com/llvm/llvm-project/pull/105638
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/19.x: [MCA][X86] Add missing 512-bit vpscatterqd/vscatterqps schedule data (REAPPLIED) (PR #105815)

2024-08-23 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang approved this pull request.

LGTM.

https://github.com/llvm/llvm-project/pull/105815
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [X86] Avoid generating nested CALLSEQ for TLS pointer function arguments (PR #106965)

2024-09-02 Thread Phoebe Wang via llvm-branch-commits


@@ -0,0 +1,17 @@
+; RUN: llc -verify-machineinstrs < %s -relocation-model=pic
+
+target datalayout = 
"e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"

phoebewang wrote:

This can put in the RUN line like
`; RUN: llc -mtriple=x86_64 -verify-machineinstrs < %s -relocation-model=pic | 
FileCheck %s`

And add FileCheck to show the assembly.

https://github.com/llvm/llvm-project/pull/106965
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [X86] Avoid generating nested CALLSEQ for TLS pointer function arguments (PR #106965)

2024-09-02 Thread Phoebe Wang via llvm-branch-commits


@@ -0,0 +1,17 @@
+; RUN: llc -verify-machineinstrs < %s -relocation-model=pic
+
+target datalayout = 
"e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"

phoebewang wrote:

Do not need this.

https://github.com/llvm/llvm-project/pull/106965
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [X86] Avoid generating nested CALLSEQ for TLS pointer function arguments (PR #106965)

2024-09-02 Thread Phoebe Wang via llvm-branch-commits


@@ -0,0 +1,17 @@
+; RUN: llc -verify-machineinstrs < %s -relocation-model=pic
+
+target datalayout = 
"e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+; Passing a pointer to thread-local storage to a function can be problematic
+; since computing such addresses requires a function call that is introduced
+; very late in instruction selection. We need to ensure that we don't introduce
+; nested call sequence markers if this function call happens in a call 
sequence.
+
+@TLS = internal thread_local global i64 zeroinitializer, align 8
+declare void @bar(ptr)
+define internal void @foo() {
+call void @bar(ptr @TLS)
+call void @bar(ptr @TLS)
+ret void

phoebewang wrote:

Two spaces indentation.

https://github.com/llvm/llvm-project/pull/106965
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [X86] Avoid generating nested CALLSEQ for TLS pointer function arguments (PR #106965)

2024-09-03 Thread Phoebe Wang via llvm-branch-commits


@@ -0,0 +1,17 @@
+; RUN: llc -verify-machineinstrs < %s -relocation-model=pic
+
+target datalayout = 
"e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"

phoebewang wrote:

Yes, we need to check the functional correctness.

https://github.com/llvm/llvm-project/pull/106965
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] e7b1dc9 - [X86][NFC] Pre-commit test to show prolog insert problem

2021-11-12 Thread Phoebe Wang via llvm-branch-commits

Author: Phoebe Wang
Date: 2021-11-13T11:30:09+08:00
New Revision: e7b1dc9a9dbabf0a13a76d44b179c0230222

URL: 
https://github.com/llvm/llvm-project/commit/e7b1dc9a9dbabf0a13a76d44b179c0230222
DIFF: 
https://github.com/llvm/llvm-project/commit/e7b1dc9a9dbabf0a13a76d44b179c0230222.diff

LOG: [X86][NFC] Pre-commit test to show prolog insert problem

Added: 
llvm/test/CodeGen/X86/vaargs-prolog-insert.ll

Modified: 


Removed: 




diff  --git a/llvm/test/CodeGen/X86/vaargs-prolog-insert.ll 
b/llvm/test/CodeGen/X86/vaargs-prolog-insert.ll
new file mode 100644
index 0..952a9e2d8b4e2
--- /dev/null
+++ b/llvm/test/CodeGen/X86/vaargs-prolog-insert.ll
@@ -0,0 +1,45 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=x86_64 < %s | FileCheck %s
+
+; Check the prolog won't be sunk across the save of CSRs.
+define void @reduce(i32, i32, i32, i32, i32, i32, ...) nounwind {
+; CHECK-LABEL: reduce:
+; CHECK:   # %bb.0:
+; CHECK-NEXT:testb %al, %al
+; CHECK-NEXT:je .LBB0_4
+; CHECK-NEXT:  # %bb.3:
+; CHECK-NEXT:movaps %xmm0, -{{[0-9]+}}(%rsp)
+; CHECK-NEXT:movaps %xmm1, -{{[0-9]+}}(%rsp)
+; CHECK-NEXT:movaps %xmm2, -{{[0-9]+}}(%rsp)
+; CHECK-NEXT:movaps %xmm3, -{{[0-9]+}}(%rsp)
+; CHECK-NEXT:movaps %xmm4, -{{[0-9]+}}(%rsp)
+; CHECK-NEXT:movaps %xmm5, (%rsp)
+; CHECK-NEXT:movaps %xmm6, {{[0-9]+}}(%rsp)
+; CHECK-NEXT:movaps %xmm7, {{[0-9]+}}(%rsp)
+; CHECK-NEXT:  .LBB0_4:
+; CHECK-NEXT:xorl %eax, %eax
+; CHECK-NEXT:testb %al, %al
+; CHECK-NEXT:jne .LBB0_2
+; CHECK-NEXT:  # %bb.1:
+; CHECK-NEXT:subq $56, %rsp
+; CHECK-NEXT:leaq -{{[0-9]+}}(%rsp), %rax
+; CHECK-NEXT:movq %rax, 16
+; CHECK-NEXT:leaq {{[0-9]+}}(%rsp), %rax
+; CHECK-NEXT:movq %rax, 8
+; CHECK-NEXT:movl $48, 4
+; CHECK-NEXT:movl $48, 0
+; CHECK-NEXT:addq $56, %rsp
+; CHECK-NEXT:  .LBB0_2:
+; CHECK-NEXT:retq
+  br i1 undef, label %8, label %7
+
+7:; preds = %6
+  call void @llvm.va_start(i8* null)
+  br label %8
+
+8:; preds = %7, %6
+  ret void
+}
+
+declare void @llvm.va_start(i8*)
+declare void @llvm.va_end(i8*)



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] a566a4b - [X86][VARARG] Assign MMO earlier to avoid prolog insert point been sunk across VASTART_SAVE_XMM_REGS

2021-11-12 Thread Phoebe Wang via llvm-branch-commits

Author: Phoebe Wang
Date: 2021-11-13T11:32:57+08:00
New Revision: a566a4b21157e7aedb6d4230f61e09f6cd3152ce

URL: 
https://github.com/llvm/llvm-project/commit/a566a4b21157e7aedb6d4230f61e09f6cd3152ce
DIFF: 
https://github.com/llvm/llvm-project/commit/a566a4b21157e7aedb6d4230f61e09f6cd3152ce.diff

LOG: [X86][VARARG] Assign MMO earlier to avoid prolog insert point been sunk 
across VASTART_SAVE_XMM_REGS

The changes in D80163 defered the assignment of MachineMemOperand (MMO)
until the X86ExpandPseudo pass. This will result in crash due to prolog
insert point been sunk across the pseudo instruction VASTART_SAVE_XMM_REGS.

Moving the assignment to the creation of the node can avoid the problem.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D112859

Added: 


Modified: 
llvm/lib/Target/X86/X86ExpandPseudo.cpp
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/lib/Target/X86/X86ISelLowering.h
llvm/lib/Target/X86/X86InstrCompiler.td
llvm/lib/Target/X86/X86InstrInfo.td
llvm/test/CodeGen/X86/vaargs-prolog-insert.ll

Removed: 




diff  --git a/llvm/lib/Target/X86/X86ExpandPseudo.cpp 
b/llvm/lib/Target/X86/X86ExpandPseudo.cpp
index 4add8d30e010e..65ffe6621545b 100644
--- a/llvm/lib/Target/X86/X86ExpandPseudo.cpp
+++ b/llvm/lib/Target/X86/X86ExpandPseudo.cpp
@@ -657,35 +657,24 @@ void X86ExpandPseudo::ExpandVastartSaveXmmRegs(
   EntryBlk->end());
   TailBlk->transferSuccessorsAndUpdatePHIs(EntryBlk);
 
-  int64_t FrameIndex = VAStartPseudoInstr->getOperand(1).getImm();
-  Register BaseReg;
-  uint64_t FrameOffset =
-  X86FL->getFrameIndexReference(*Func, FrameIndex, BaseReg).getFixed();
-  uint64_t VarArgsRegsOffset = VAStartPseudoInstr->getOperand(2).getImm();
+  uint64_t FrameOffset = VAStartPseudoInstr->getOperand(4).getImm();
+  uint64_t VarArgsRegsOffset = VAStartPseudoInstr->getOperand(6).getImm();
 
   // TODO: add support for YMM and ZMM here.
   unsigned MOVOpc = STI->hasAVX() ? X86::VMOVAPSmr : X86::MOVAPSmr;
 
   // In the XMM save block, save all the XMM argument registers.
-  for (int64_t OpndIdx = 3, RegIdx = 0;
+  for (int64_t OpndIdx = 7, RegIdx = 0;
OpndIdx < VAStartPseudoInstr->getNumOperands() - 1;
OpndIdx++, RegIdx++) {
-
-int64_t Offset = FrameOffset + VarArgsRegsOffset + RegIdx * 16;
-
-MachineMemOperand *MMO = Func->getMachineMemOperand(
-MachinePointerInfo::getFixedStack(*Func, FrameIndex, Offset),
-MachineMemOperand::MOStore,
-/*Size=*/16, Align(16));
-
-BuildMI(GuardedRegsBlk, DL, TII->get(MOVOpc))
-.addReg(BaseReg)
-.addImm(/*Scale=*/1)
-.addReg(/*IndexReg=*/0)
-.addImm(/*Disp=*/Offset)
-.addReg(/*Segment=*/0)
-.addReg(VAStartPseudoInstr->getOperand(OpndIdx).getReg())
-.addMemOperand(MMO);
+auto NewMI = BuildMI(GuardedRegsBlk, DL, TII->get(MOVOpc));
+for (int i = 0; i < X86::AddrNumOperands; ++i) {
+  if (i == X86::AddrDisp)
+NewMI.addImm(FrameOffset + VarArgsRegsOffset + RegIdx * 16);
+  else
+NewMI.add(VAStartPseudoInstr->getOperand(i + 1));
+}
+NewMI.addReg(VAStartPseudoInstr->getOperand(OpndIdx).getReg());
 assert(Register::isPhysicalRegister(
 VAStartPseudoInstr->getOperand(OpndIdx).getReg()));
   }

diff  --git a/llvm/lib/Target/X86/X86ISelLowering.cpp 
b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 032db2a80a77e..f3b1e6ca70ad1 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -3533,13 +3533,19 @@ void 
VarArgsLoweringHelper::createVarArgAreaAndStoreRegisters(
   SmallVector SaveXMMOps;
   SaveXMMOps.push_back(Chain);
   SaveXMMOps.push_back(ALVal);
-  SaveXMMOps.push_back(
-  DAG.getTargetConstant(FuncInfo->getRegSaveFrameIndex(), DL, 
MVT::i32));
+  SaveXMMOps.push_back(RSFIN);
   SaveXMMOps.push_back(
   DAG.getTargetConstant(FuncInfo->getVarArgsFPOffset(), DL, MVT::i32));
   llvm::append_range(SaveXMMOps, LiveXMMRegs);
-  MemOps.push_back(DAG.getNode(X86ISD::VASTART_SAVE_XMM_REGS, DL,
-   MVT::Other, SaveXMMOps));
+  MachineMemOperand *StoreMMO =
+  DAG.getMachineFunction().getMachineMemOperand(
+  MachinePointerInfo::getFixedStack(
+  DAG.getMachineFunction(), FuncInfo->getRegSaveFrameIndex(),
+  Offset),
+  MachineMemOperand::MOStore, 128, Align(16));
+  MemOps.push_back(DAG.getMemIntrinsicNode(X86ISD::VASTART_SAVE_XMM_REGS,
+   DL, DAG.getVTList(MVT::Other),
+   SaveXMMOps, MVT::i8, StoreMMO));
 }
 
 if (!MemOps.empty())

diff  --git a/llvm/lib/Target/X86/X86ISelLowering.h 
b/llvm/lib/Target/X86/X86ISelLowering.h
index 869857bcc0d61..8b18b5981e86f 100644
--- a/llvm/lib/Target/X86/

[llvm-branch-commits] [llvm] e4b3fad - [clang][llvm][doc] Add more information for the ABI change in FP16

2022-08-04 Thread Phoebe Wang via llvm-branch-commits

Author: Phoebe Wang
Date: 2022-08-04T22:31:16+08:00
New Revision: e4b3fad1facdb3f005ec77f607a2b05e3e9fcbad

URL: 
https://github.com/llvm/llvm-project/commit/e4b3fad1facdb3f005ec77f607a2b05e3e9fcbad
DIFF: 
https://github.com/llvm/llvm-project/commit/e4b3fad1facdb3f005ec77f607a2b05e3e9fcbad.diff

LOG: [clang][llvm][doc] Add more information for the ABI change in FP16

Differential Revision: https://reviews.llvm.org/D131172

Added: 


Modified: 
clang/docs/ReleaseNotes.rst
llvm/docs/ReleaseNotes.rst

Removed: 




diff  --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 84c74335ea5d1..41e9e09d6b436 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -619,6 +619,19 @@ X86 Support in Clang
   will be used by Linux kernel mitigations for RETBLEED. The corresponding flag
   ``-mfunction-return=keep`` may be appended to disable the feature.
 
+The ``_Float16`` type requires SSE2 feature and above due to the instruction
+limitations. When using it on i386 targets, you need to specify ``-msse2``
+explicitly.
+
+For targets without F16C feature or above, please make sure:
+
+- Use GCC 12.0 and above if you are using libgcc.
+- If you are using compiler-rt, use the same version with the compiler.
+Early versions provided FP16 builtins in a 
diff erent ABI. A workaround is to use
+a small code snippet to check the ABI if you cannot make sure of it.
+- If you are using downstream runtimes that provide FP16 conversions, update
+them with the new ABI.
+
 DWARF Support in Clang
 --
 

diff  --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index cf41439d8dd74..d94687263bb3f 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -186,9 +186,25 @@ Changes to the WebAssembly Backend
 Changes to the X86 Backend
 --
 
-* Support ``half`` type on SSE2 and above targets.
+* Support ``half`` type on SSE2 and above targets following X86 psABI.
 * Support ``rdpru`` instruction on Zen2 and above targets.
 
+During this release, ``half`` type has an ABI breaking change to provide the
+support for the ABI of ``_Float16`` type on SSE2 and above following X86 psABI.
+(`D107082 `_)
+
+The change may affect the current use of ``half`` includes (but is not limited
+to):
+
+* Frontends generating ``half`` type in function passing and/or returning
+arguments.
+* Downstream runtimes providing any ``half`` conversion builtins assuming the
+old ABI.
+* Projects built with LLVM 15.0 but using early versions of compiler-rt.
+
+When you find failures with ``half`` type, check the calling conversion of the
+code and switch it to the new ABI.
+
 Changes to the OCaml bindings
 -
 



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/19.x: [x86] combineMUL - when looking for a vector multiply by splat constant, ensure we're only accepting ConstantInt splat scalars. (PR #111246)

2024-10-11 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

LGTM.

https://github.com/llvm/llvm-project/pull/111246
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [X86][APX] Fix wrong encoding of promoted KMOV instructions due to missing NoCD8 (#109579) (PR #109635)

2024-09-23 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang created 
https://github.com/llvm/llvm-project/pull/109635

Promoted KMOV* was encoded with CD8 incorrectly, see 
https://godbolt.org/z/cax513hG1

>From b403d2a05b548f24b46bab4c4ae014c9949f3c44 Mon Sep 17 00:00:00 2001
From: Phoebe Wang 
Date: Mon, 23 Sep 2024 09:41:43 +0800
Subject: [PATCH] [X86][APX] Fix wrong encoding of promoted KMOV instructions
 due to missing NoCD8 (#109579)

Promoted KMOV* was encoded with CD8 incorrectly, see
https://godbolt.org/z/cax513hG1
---
 llvm/lib/Target/X86/X86InstrAVX512.td  | 27 +++---
 llvm/test/MC/Disassembler/X86/apx/kmov.txt | 16 +
 llvm/test/MC/X86/apx/kmov-att.s| 14 ++-
 llvm/test/MC/X86/apx/kmov-intel.s  | 12 ++
 4 files changed, 55 insertions(+), 14 deletions(-)

diff --git a/llvm/lib/Target/X86/X86InstrAVX512.td 
b/llvm/lib/Target/X86/X86InstrAVX512.td
index da690aea43f5c0..1bf201f2bb87e4 100644
--- a/llvm/lib/Target/X86/X86InstrAVX512.td
+++ b/llvm/lib/Target/X86/X86InstrAVX512.td
@@ -2617,19 +2617,20 @@ defm VFPCLASS : avx512_fp_fpclass_all<"vfpclass", 0x66, 
0x67, SchedWriteFCmp>, E
 multiclass avx512_mask_mov opc_kk, bits<8> opc_km, bits<8> opc_mk,
   string OpcodeStr, RegisterClass KRC, ValueType vvt,
   X86MemOperand x86memop, string Suffix = ""> {
-  let isMoveReg = 1, hasSideEffects = 0, SchedRW = [WriteMove],
-  explicitOpPrefix = !if(!eq(Suffix, ""), NoExplicitOpPrefix, 
ExplicitEVEX) in
-  def kk#Suffix : I,
-  Sched<[WriteMove]>;
-  def km#Suffix : I,
-  Sched<[WriteLoad]>;
-  def mk#Suffix : I,
-  Sched<[WriteStore]>;
+  let explicitOpPrefix = !if(!eq(Suffix, ""), NoExplicitOpPrefix, 
ExplicitEVEX) in {
+let isMoveReg = 1, hasSideEffects = 0, SchedRW = [WriteMove] in
+def kk#Suffix : I,
+Sched<[WriteMove]>;
+def km#Suffix : I,
+Sched<[WriteLoad]>, NoCD8;
+def mk#Suffix : I,
+Sched<[WriteStore]>, NoCD8;
+  }
 }
 
 multiclass avx512_mask_mov_gpr opc_kr, bits<8> opc_rk,
diff --git a/llvm/test/MC/Disassembler/X86/apx/kmov.txt 
b/llvm/test/MC/Disassembler/X86/apx/kmov.txt
index 5d947ff39f2314..45fedbd0da587b 100644
--- a/llvm/test/MC/Disassembler/X86/apx/kmov.txt
+++ b/llvm/test/MC/Disassembler/X86/apx/kmov.txt
@@ -17,6 +17,22 @@
 # INTEL: {evex} kmovq  k2, k1
 0x62,0xf1,0xfc,0x08,0x90,0xd1
 
+# ATT:   {evex} kmovb   -16(%rax), %k0
+# INTEL: {evex} kmovb   k0, byte ptr [rax - 16]
+0x62,0xf1,0x7d,0x08,0x90,0x40,0xf0
+
+# ATT:   {evex} kmovw   -16(%rax), %k0
+# INTEL: {evex} kmovw   k0, word ptr [rax - 16]
+0x62,0xf1,0x7c,0x08,0x90,0x40,0xf0
+
+# ATT:   {evex} kmovd   -16(%rax), %k0
+# INTEL: {evex} kmovd   k0, dword ptr [rax - 16]
+0x62,0xf1,0xfd,0x08,0x90,0x40,0xf0
+
+# ATT:   {evex} kmovq   -16(%rax), %k0
+# INTEL: {evex} kmovq   k0, qword ptr [rax - 16]
+0x62,0xf1,0xfc,0x08,0x90,0x40,0xf0
+
 # ATT-NOT: {evex}
 # INTEL-NOT: {evex}
 
diff --git a/llvm/test/MC/X86/apx/kmov-att.s b/llvm/test/MC/X86/apx/kmov-att.s
index 949ef65be98d4c..5f59e0a505b235 100644
--- a/llvm/test/MC/X86/apx/kmov-att.s
+++ b/llvm/test/MC/X86/apx/kmov-att.s
@@ -1,7 +1,7 @@
 # RUN: llvm-mc -triple x86_64 -show-encoding %s | FileCheck %s
 # RUN: not llvm-mc -triple i386 -show-encoding %s 2>&1 | FileCheck %s 
--check-prefix=ERROR
 
-# ERROR-COUNT-20: error:
+# ERROR-COUNT-24: error:
 # ERROR-NOT: error:
 # CHECK: {evex}kmovb   %k1, %k2
 # CHECK: encoding: [0x62,0xf1,0x7d,0x08,0x90,0xd1]
@@ -15,6 +15,18 @@
 # CHECK: {evex}kmovq   %k1, %k2
 # CHECK: encoding: [0x62,0xf1,0xfc,0x08,0x90,0xd1]
  {evex}kmovq   %k1, %k2
+# CHECK: {evex} kmovb   -16(%rax), %k0
+# CHECK: encoding: [0x62,0xf1,0x7d,0x08,0x90,0x40,0xf0]
+ {evex} kmovb   -0x10(%rax), %k0
+# CHECK: {evex} kmovw   -16(%rax), %k0
+# CHECK: encoding: [0x62,0xf1,0x7c,0x08,0x90,0x40,0xf0]
+ {evex} kmovw   -0x10(%rax), %k0
+# CHECK: {evex} kmovd   -16(%rax), %k0
+# CHECK: encoding: [0x62,0xf1,0xfd,0x08,0x90,0x40,0xf0]
+ {evex} kmovd   -0x10(%rax), %k0
+# CHECK: {evex} kmovq   -16(%rax), %k0
+# CHECK: encoding: [0x62,0xf1,0xfc,0x08,0x90,0x40,0xf0]
+ {evex} kmovq   -0x10(%rax), %k0
 
 # CHECK-NOT: {evex}
 
diff --git a/llvm/test/MC/X86/apx/kmov-intel.s 
b/llvm/test/MC/X86/apx/kmov-intel.s
index 0cdbd310062eba..51cec67caf9a04 100644
--- a/llvm/test/MC/X86/apx/kmov-intel.s
+++ b/llvm/test/MC/X86/apx/kmov-intel.s
@@ -12,6 +12,18 @@
 # CHECK: {evex}kmovq   k2, k1
 # CHECK: encoding: [0x62,0xf1,0xfc,0x08,0x90,0xd1]
  {evex}kmovq   k2, k1
+# CHECK: {evex} kmovb   k0, byte ptr [rax - 16]
+# CHECK: encoding: [0x62,0xf1,0x7d,0x08,0x90,0x40,0xf0]
+ {evex} kmovb   k0, byte ptr [rax - 0x10]
+# CHECK: {evex} kmovw   k0, word ptr [rax - 16]
+# CHECK: encoding: [0x62,0xf1,0x7c,0x08,0x90,0x40,0xf0]
+ {evex} kmovw   k0, word ptr [rax - 0x10]
+# CHECK: {evex} kmovd   k0, dword ptr [rax -

[llvm-branch-commits] [llvm] [X86][APX] Fix wrong encoding of promoted KMOV instructions due to missing NoCD8 (#109579) (PR #109635)

2024-09-23 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang milestoned 
https://github.com/llvm/llvm-project/pull/109635
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/19.x: [X86][APX] Fix wrong encoding of promoted KMOV instructions due to missing NoCD8 (#109759) (PR #109767)

2024-09-24 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang edited 
https://github.com/llvm/llvm-project/pull/109767
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [X86][APX] Fix wrong encoding of promoted KMOV instructions due to missing NoCD8 (#109579) (PR #109635)

2024-09-24 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

Sorry, there was a mistake with the patch. I'll close it and create another one.

https://github.com/llvm/llvm-project/pull/109635
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [X86][APX] Fix wrong encoding of promoted KMOV instructions due to missing NoCD8 (#109579) (PR #109635)

2024-09-24 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang closed 
https://github.com/llvm/llvm-project/pull/109635
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/19.x: [X86][APX] Fix wrong encoding of promoted KMOV instructions due to missing NoCD8 (#109759) (PR #109767)

2024-10-01 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

Thanks @tru! The title should be good as the description.

https://github.com/llvm/llvm-project/pull/109767
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [X86][AVX10] Disable m[no-]avx10.1 and switch m[no-]avx10.2 to alias of 512 bit options (#124511) (PR #125057)

2025-02-04 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

@e-kud Please help review it.

> It had the 'needs review' status and I missed it. We can get it into -rc2.

Oh, I thought it notified reviewer automaticly. Maybe because it's manully 
cherry-pick.

https://github.com/llvm/llvm-project/pull/125057
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/20.x: [AVX10.2] Fix wrong intrinsic names after rename (#126390) (PR #126687)

2025-02-10 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

This is a bug fix without risk, LGTM.

https://github.com/llvm/llvm-project/pull/126687
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [X86][AVX10] Disable m[no-]avx10.1 and switch m[no-]avx10.2 to alias of 512 bit options (#124511) (PR #125057)

2025-02-01 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang updated 
https://github.com/llvm/llvm-project/pull/125057

>From f816bd39f6986825e338198fce8747939ab1c882 Mon Sep 17 00:00:00 2001
From: Phoebe Wang 
Date: Thu, 30 Jan 2025 21:13:49 +0800
Subject: [PATCH] [X86][AVX10] Disable m[no-]avx10.1 and switch m[no-]avx10.2
 to alias of 512 bit options (#124511)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Per the feedback we got, we’d like to switch m[no-]avx10.2 to alias of
512 bit options and disable m[no-]avx10.1 due to they were alias of 256
bit options.

We also change -mno-avx10.[1,2]-512 to alias of 256 bit options to
disable both 256 and 512 instructions.
---
 clang/docs/ReleaseNotes.rst   |  4 
 clang/include/clang/Driver/Options.td | 12 +---
 clang/lib/Driver/ToolChains/Arch/X86.cpp  | 13 -
 clang/test/Driver/x86-target-features.c   | 16 ++--
 clang/test/Preprocessor/x86_target_features.c |  3 +--
 5 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index d8a94703bd9c57..c36a84c2b8362b 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1159,6 +1159,10 @@ X86 Support
 - Support ISA of ``MOVRS``.
 
 - Supported ``-march/tune=diamondrapids``
+- Disable ``-m[no-]avx10.1`` and switch ``-m[no-]avx10.2`` to alias of 512 bit
+  options.
+- Change ``-mno-avx10.1-512`` to alias of ``-mno-avx10.1-256`` to disable both
+  256 and 512 bit instructions.
 
 Arm and AArch64 Support
 ^^^
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 1af633e59d0bba..a2b47b943ef90d 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -6441,15 +6441,13 @@ def mno_avx : Flag<["-"], "mno-avx">, 
Group;
 def mavx10_1_256 : Flag<["-"], "mavx10.1-256">, 
Group;
 def mno_avx10_1_256 : Flag<["-"], "mno-avx10.1-256">, 
Group;
 def mavx10_1_512 : Flag<["-"], "mavx10.1-512">, 
Group;
-def mno_avx10_1_512 : Flag<["-"], "mno-avx10.1-512">, 
Group;
-def mavx10_1 : Flag<["-"], "mavx10.1">, Alias;
-def mno_avx10_1 : Flag<["-"], "mno-avx10.1">, Alias;
+def mno_avx10_1_512 : Flag<["-"], "mno-avx10.1-512">, Alias;
+def mavx10_1 : Flag<["-"], "mavx10.1">, Flags<[Unsupported]>;
+def mno_avx10_1 : Flag<["-"], "mno-avx10.1">, Flags<[Unsupported]>;
 def mavx10_2_256 : Flag<["-"], "mavx10.2-256">, 
Group;
-def mno_avx10_2_256 : Flag<["-"], "mno-avx10.2-256">, 
Group;
 def mavx10_2_512 : Flag<["-"], "mavx10.2-512">, 
Group;
-def mno_avx10_2_512 : Flag<["-"], "mno-avx10.2-512">, 
Group;
-def mavx10_2 : Flag<["-"], "mavx10.2">, Alias;
-def mno_avx10_2 : Flag<["-"], "mno-avx10.2">, Alias;
+def mavx10_2 : Flag<["-"], "mavx10.2">, Alias;
+def mno_avx10_2 : Flag<["-"], "mno-avx10.2">, 
Group;
 def mavx2 : Flag<["-"], "mavx2">, Group;
 def mno_avx2 : Flag<["-"], "mno-avx2">, Group;
 def mavx512f : Flag<["-"], "mavx512f">, Group;
diff --git a/clang/lib/Driver/ToolChains/Arch/X86.cpp 
b/clang/lib/Driver/ToolChains/Arch/X86.cpp
index b2109e11038fe8..47c2c3e23f9fd9 100644
--- a/clang/lib/Driver/ToolChains/Arch/X86.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/X86.cpp
@@ -237,15 +237,18 @@ void x86::getX86TargetFeatures(const Driver &D, const 
llvm::Triple &Triple,
 
 bool IsNegative = Name.consume_front("no-");
 
-#ifndef NDEBUG
-assert(Name.starts_with("avx10.") && "Invalid AVX10 feature name.");
 StringRef Version, Width;
 std::tie(Version, Width) = Name.substr(6).split('-');
+assert(Name.starts_with("avx10.") && "Invalid AVX10 feature name.");
 assert((Version == "1" || Version == "2") && "Invalid AVX10 feature 
name.");
-assert((Width == "256" || Width == "512") && "Invalid AVX10 feature 
name.");
-#endif
 
-Features.push_back(Args.MakeArgString((IsNegative ? "-" : "+") + Name));
+if (Width == "") {
+  assert(IsNegative && "Only negative options can omit width.");
+  Features.push_back(Args.MakeArgString("-" + Name + "-256"));
+} else {
+  assert((Width == "256" || Width == "512") && "Invalid vector length.");
+  Features.push_back(Args.MakeArgString((IsNegative ? "-" : "+") + Name));
+}
   }
 
   // Now add any that the user explicitly requested on the command line,
diff --git a/clang/test/Driver/x86-target-features.c 
b/clang/test/Driver/x86-target-features.c
index 339f593dc760a8..18361251dcebc5 100644
--- a/clang/test/Driver/x86-target-features.c
+++ b/clang/test/Driver/x86-target-features.c
@@ -395,7 +395,8 @@
 // EVEX512: "-target-feature" "+evex512"
 // NO-EVEX512: "-target-feature" "-evex512"
 
-// RUN: %clang --target=i386 -mavx10.1 %s -### -o %t.o 2>&1 | FileCheck 
-check-prefix=AVX10_1_256 %s
+// RUN: not %clang --target=i386 -march=i386 -mavx10.1 %s -### -o %t.o 2>&1 | 
FileCheck -check-prefix=UNSUPPORT-AVX10 %s
+// RUN: not %clang --target=i386 -march=i386 -mno-avx10.1 %s -### -o %t.o 2>&1 
| FileChec

[llvm-branch-commits] [clang] [X86][AVX10] Disable m[no-]avx10.1 and switch m[no-]avx10.2 to alias of 512 bit options (#124511) (PR #125057)

2025-02-02 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

Ping @tstellar, any reason this was missing in rc1 release?

https://github.com/llvm/llvm-project/pull/125057
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [X86][AVX10] Disable m[no-]avx10.1 and switch m[no-]avx10.2 to alias of 512 bit options (#124511) (PR #125057)

2025-02-02 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang updated 
https://github.com/llvm/llvm-project/pull/125057

>From f816bd39f6986825e338198fce8747939ab1c882 Mon Sep 17 00:00:00 2001
From: Phoebe Wang 
Date: Thu, 30 Jan 2025 21:13:49 +0800
Subject: [PATCH] [X86][AVX10] Disable m[no-]avx10.1 and switch m[no-]avx10.2
 to alias of 512 bit options (#124511)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Per the feedback we got, we’d like to switch m[no-]avx10.2 to alias of
512 bit options and disable m[no-]avx10.1 due to they were alias of 256
bit options.

We also change -mno-avx10.[1,2]-512 to alias of 256 bit options to
disable both 256 and 512 instructions.
---
 clang/docs/ReleaseNotes.rst   |  4 
 clang/include/clang/Driver/Options.td | 12 +---
 clang/lib/Driver/ToolChains/Arch/X86.cpp  | 13 -
 clang/test/Driver/x86-target-features.c   | 16 ++--
 clang/test/Preprocessor/x86_target_features.c |  3 +--
 5 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index d8a94703bd9c57..c36a84c2b8362b 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1159,6 +1159,10 @@ X86 Support
 - Support ISA of ``MOVRS``.
 
 - Supported ``-march/tune=diamondrapids``
+- Disable ``-m[no-]avx10.1`` and switch ``-m[no-]avx10.2`` to alias of 512 bit
+  options.
+- Change ``-mno-avx10.1-512`` to alias of ``-mno-avx10.1-256`` to disable both
+  256 and 512 bit instructions.
 
 Arm and AArch64 Support
 ^^^
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 1af633e59d0bba..a2b47b943ef90d 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -6441,15 +6441,13 @@ def mno_avx : Flag<["-"], "mno-avx">, 
Group;
 def mavx10_1_256 : Flag<["-"], "mavx10.1-256">, 
Group;
 def mno_avx10_1_256 : Flag<["-"], "mno-avx10.1-256">, 
Group;
 def mavx10_1_512 : Flag<["-"], "mavx10.1-512">, 
Group;
-def mno_avx10_1_512 : Flag<["-"], "mno-avx10.1-512">, 
Group;
-def mavx10_1 : Flag<["-"], "mavx10.1">, Alias;
-def mno_avx10_1 : Flag<["-"], "mno-avx10.1">, Alias;
+def mno_avx10_1_512 : Flag<["-"], "mno-avx10.1-512">, Alias;
+def mavx10_1 : Flag<["-"], "mavx10.1">, Flags<[Unsupported]>;
+def mno_avx10_1 : Flag<["-"], "mno-avx10.1">, Flags<[Unsupported]>;
 def mavx10_2_256 : Flag<["-"], "mavx10.2-256">, 
Group;
-def mno_avx10_2_256 : Flag<["-"], "mno-avx10.2-256">, 
Group;
 def mavx10_2_512 : Flag<["-"], "mavx10.2-512">, 
Group;
-def mno_avx10_2_512 : Flag<["-"], "mno-avx10.2-512">, 
Group;
-def mavx10_2 : Flag<["-"], "mavx10.2">, Alias;
-def mno_avx10_2 : Flag<["-"], "mno-avx10.2">, Alias;
+def mavx10_2 : Flag<["-"], "mavx10.2">, Alias;
+def mno_avx10_2 : Flag<["-"], "mno-avx10.2">, 
Group;
 def mavx2 : Flag<["-"], "mavx2">, Group;
 def mno_avx2 : Flag<["-"], "mno-avx2">, Group;
 def mavx512f : Flag<["-"], "mavx512f">, Group;
diff --git a/clang/lib/Driver/ToolChains/Arch/X86.cpp 
b/clang/lib/Driver/ToolChains/Arch/X86.cpp
index b2109e11038fe8..47c2c3e23f9fd9 100644
--- a/clang/lib/Driver/ToolChains/Arch/X86.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/X86.cpp
@@ -237,15 +237,18 @@ void x86::getX86TargetFeatures(const Driver &D, const 
llvm::Triple &Triple,
 
 bool IsNegative = Name.consume_front("no-");
 
-#ifndef NDEBUG
-assert(Name.starts_with("avx10.") && "Invalid AVX10 feature name.");
 StringRef Version, Width;
 std::tie(Version, Width) = Name.substr(6).split('-');
+assert(Name.starts_with("avx10.") && "Invalid AVX10 feature name.");
 assert((Version == "1" || Version == "2") && "Invalid AVX10 feature 
name.");
-assert((Width == "256" || Width == "512") && "Invalid AVX10 feature 
name.");
-#endif
 
-Features.push_back(Args.MakeArgString((IsNegative ? "-" : "+") + Name));
+if (Width == "") {
+  assert(IsNegative && "Only negative options can omit width.");
+  Features.push_back(Args.MakeArgString("-" + Name + "-256"));
+} else {
+  assert((Width == "256" || Width == "512") && "Invalid vector length.");
+  Features.push_back(Args.MakeArgString((IsNegative ? "-" : "+") + Name));
+}
   }
 
   // Now add any that the user explicitly requested on the command line,
diff --git a/clang/test/Driver/x86-target-features.c 
b/clang/test/Driver/x86-target-features.c
index 339f593dc760a8..18361251dcebc5 100644
--- a/clang/test/Driver/x86-target-features.c
+++ b/clang/test/Driver/x86-target-features.c
@@ -395,7 +395,8 @@
 // EVEX512: "-target-feature" "+evex512"
 // NO-EVEX512: "-target-feature" "-evex512"
 
-// RUN: %clang --target=i386 -mavx10.1 %s -### -o %t.o 2>&1 | FileCheck 
-check-prefix=AVX10_1_256 %s
+// RUN: not %clang --target=i386 -march=i386 -mavx10.1 %s -### -o %t.o 2>&1 | 
FileCheck -check-prefix=UNSUPPORT-AVX10 %s
+// RUN: not %clang --target=i386 -march=i386 -mno-avx10.1 %s -### -o %t.o 2>&1 
| FileChec

[llvm-branch-commits] [clang] [X86][AVX10] Disable m[no-]avx10.1 and switch m[no-]avx10.2 to alias of 512 bit options (#124511) (PR #125057)

2025-01-30 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang created 
https://github.com/llvm/llvm-project/pull/125057

Per the feedback we got, we’d like to switch m[no-]avx10.2 to alias of 512 bit 
options and disable m[no-]avx10.1 due to they were alias of 256 bit options.

We also change -mno-avx10.[1,2]-512 to alias of 256 bit options to disable both 
256 and 512 instructions.

Cherry-pick from 
https://github.com/llvm/llvm-project/commit/9ebfee9d686b41f789b47a6acc7dcdba808fb3f9

>From f816bd39f6986825e338198fce8747939ab1c882 Mon Sep 17 00:00:00 2001
From: Phoebe Wang 
Date: Thu, 30 Jan 2025 21:13:49 +0800
Subject: [PATCH] [X86][AVX10] Disable m[no-]avx10.1 and switch m[no-]avx10.2
 to alias of 512 bit options (#124511)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Per the feedback we got, we’d like to switch m[no-]avx10.2 to alias of
512 bit options and disable m[no-]avx10.1 due to they were alias of 256
bit options.

We also change -mno-avx10.[1,2]-512 to alias of 256 bit options to
disable both 256 and 512 instructions.
---
 clang/docs/ReleaseNotes.rst   |  4 
 clang/include/clang/Driver/Options.td | 12 +---
 clang/lib/Driver/ToolChains/Arch/X86.cpp  | 13 -
 clang/test/Driver/x86-target-features.c   | 16 ++--
 clang/test/Preprocessor/x86_target_features.c |  3 +--
 5 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index d8a94703bd9c57..c36a84c2b8362b 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -1159,6 +1159,10 @@ X86 Support
 - Support ISA of ``MOVRS``.
 
 - Supported ``-march/tune=diamondrapids``
+- Disable ``-m[no-]avx10.1`` and switch ``-m[no-]avx10.2`` to alias of 512 bit
+  options.
+- Change ``-mno-avx10.1-512`` to alias of ``-mno-avx10.1-256`` to disable both
+  256 and 512 bit instructions.
 
 Arm and AArch64 Support
 ^^^
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 1af633e59d0bba..a2b47b943ef90d 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -6441,15 +6441,13 @@ def mno_avx : Flag<["-"], "mno-avx">, 
Group;
 def mavx10_1_256 : Flag<["-"], "mavx10.1-256">, 
Group;
 def mno_avx10_1_256 : Flag<["-"], "mno-avx10.1-256">, 
Group;
 def mavx10_1_512 : Flag<["-"], "mavx10.1-512">, 
Group;
-def mno_avx10_1_512 : Flag<["-"], "mno-avx10.1-512">, 
Group;
-def mavx10_1 : Flag<["-"], "mavx10.1">, Alias;
-def mno_avx10_1 : Flag<["-"], "mno-avx10.1">, Alias;
+def mno_avx10_1_512 : Flag<["-"], "mno-avx10.1-512">, Alias;
+def mavx10_1 : Flag<["-"], "mavx10.1">, Flags<[Unsupported]>;
+def mno_avx10_1 : Flag<["-"], "mno-avx10.1">, Flags<[Unsupported]>;
 def mavx10_2_256 : Flag<["-"], "mavx10.2-256">, 
Group;
-def mno_avx10_2_256 : Flag<["-"], "mno-avx10.2-256">, 
Group;
 def mavx10_2_512 : Flag<["-"], "mavx10.2-512">, 
Group;
-def mno_avx10_2_512 : Flag<["-"], "mno-avx10.2-512">, 
Group;
-def mavx10_2 : Flag<["-"], "mavx10.2">, Alias;
-def mno_avx10_2 : Flag<["-"], "mno-avx10.2">, Alias;
+def mavx10_2 : Flag<["-"], "mavx10.2">, Alias;
+def mno_avx10_2 : Flag<["-"], "mno-avx10.2">, 
Group;
 def mavx2 : Flag<["-"], "mavx2">, Group;
 def mno_avx2 : Flag<["-"], "mno-avx2">, Group;
 def mavx512f : Flag<["-"], "mavx512f">, Group;
diff --git a/clang/lib/Driver/ToolChains/Arch/X86.cpp 
b/clang/lib/Driver/ToolChains/Arch/X86.cpp
index b2109e11038fe8..47c2c3e23f9fd9 100644
--- a/clang/lib/Driver/ToolChains/Arch/X86.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/X86.cpp
@@ -237,15 +237,18 @@ void x86::getX86TargetFeatures(const Driver &D, const 
llvm::Triple &Triple,
 
 bool IsNegative = Name.consume_front("no-");
 
-#ifndef NDEBUG
-assert(Name.starts_with("avx10.") && "Invalid AVX10 feature name.");
 StringRef Version, Width;
 std::tie(Version, Width) = Name.substr(6).split('-');
+assert(Name.starts_with("avx10.") && "Invalid AVX10 feature name.");
 assert((Version == "1" || Version == "2") && "Invalid AVX10 feature 
name.");
-assert((Width == "256" || Width == "512") && "Invalid AVX10 feature 
name.");
-#endif
 
-Features.push_back(Args.MakeArgString((IsNegative ? "-" : "+") + Name));
+if (Width == "") {
+  assert(IsNegative && "Only negative options can omit width.");
+  Features.push_back(Args.MakeArgString("-" + Name + "-256"));
+} else {
+  assert((Width == "256" || Width == "512") && "Invalid vector length.");
+  Features.push_back(Args.MakeArgString((IsNegative ? "-" : "+") + Name));
+}
   }
 
   // Now add any that the user explicitly requested on the command line,
diff --git a/clang/test/Driver/x86-target-features.c 
b/clang/test/Driver/x86-target-features.c
index 339f593dc760a8..18361251dcebc5 100644
--- a/clang/test/Driver/x86-target-features.c
+++ b/clang/test/Driver/x86-target-features.c
@@ -395,7 +395,8 @@
 // EVEX512: "-target-feature" "+eve

[llvm-branch-commits] [clang] [X86][AVX10] Disable m[no-]avx10.1 and switch m[no-]avx10.2 to alias of 512 bit options (#124511) (PR #125057)

2025-01-30 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang milestoned 
https://github.com/llvm/llvm-project/pull/125057
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/20.x: [X86] When expanding LCMPXCHG16B_SAVE_RBX, substitute RBX in base (#134109) (PR #134331)

2025-04-05 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang approved this pull request.


https://github.com/llvm/llvm-project/pull/134331
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [X86] Backport new intrinsic and instruction changes in AVX10.2 (PR #133219)

2025-04-09 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

> > Doesn't this break ABI by changing intrinsic / builtin numbers?
> 
> So the headers could still be updated but must use the existing builtins in 
> the backport?

Sounds good, I can try with it. Thanks!

https://github.com/llvm/llvm-project/pull/133219
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [X86] Backport new intrinsic and instruction changes in AVX10.2 (PR #133219)

2025-04-13 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

> > > Doesn't this break ABI by changing intrinsic / builtin numbers?
> > 
> > 
> > So the headers could still be updated but must use the existing builtins in 
> > the backport?
> 
> Sounds good, I can try with it. Thanks!

A reduced one for intrinsics only 
https://github.com/llvm/llvm-project/pull/135549, PTAL.

https://github.com/llvm/llvm-project/pull/133219
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [X86] Backport new intrinsic and instruction changes in AVX10.2 (PR #133219)

2025-04-13 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang closed 
https://github.com/llvm/llvm-project/pull/133219
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/20.x: [X86] When expanding LCMPXCHG16B_SAVE_RBX, substitute RBX in base (#134109) (PR #134331)

2025-04-04 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

> @phoebewang What do you think about merging this PR to the release branch?

I think it's ok to merge.

https://github.com/llvm/llvm-project/pull/134331
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [X86] Backport new intrinsic and instruction changes in AVX10.2 (PR #133219)

2025-03-27 Thread Phoebe Wang via llvm-branch-commits

https://github.com/phoebewang edited 
https://github.com/llvm/llvm-project/pull/133219
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] [X86] Backport new intrinsic and instruction changes in AVX10.2 (PR #133219)

2025-03-27 Thread Phoebe Wang via llvm-branch-commits

phoebewang wrote:

> Although these instructions aren't supported by current hardware this 
> probably needs a release notes entry - wdyt?

Sure, added in the description. Thanks for the reminder!

https://github.com/llvm/llvm-project/pull/133219
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits