from:"JunMa via Phabricator via cfe-commits"

[PATCH] D69022: [coroutines] Remove assert on CoroutineParameterMoves in Sema::buildCoroutineParameterMoves

2019-10-16 Thread JunMa via Phabricator via cfe-commits

junparser created this revision.
junparser added reviewers: modocache, GorNishanov.
Herald added subscribers: cfe-commits, EricWF.
Herald added a project: clang.

The assertion on CoroutineParameterMoves fires when a coroutine function with
arguments is built multiple times after failing to build the promise type.

Fix: return false instead of asserting.

Test Plan: check-clang
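
The fix described above can be sketched outside of Clang as follows. This is a minimal illustration, not the code in SemaCoroutine.cpp: `ScopeInfoSketch` and `buildParameterMoves` are hypothetical names, and the "moves" are plain strings. It only shows the pattern of replacing an assertion with an early `return false` so that a repeated call is tolerated.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-function scope state; not Clang's real type.
struct ScopeInfoSketch {
  std::vector<std::string> ParameterMoves;
};

// Returns true if this call built the moves, false if they already existed.
bool buildParameterMoves(ScopeInfoSketch &Scope,
                         const std::vector<std::string> &Params) {
  // Early return instead of an assert: a second call leaves the state untouched.
  if (!Scope.ParameterMoves.empty())
    return false;
  for (const std::string &P : Params)
    Scope.ParameterMoves.push_back("move(" + P + ")");
  return true;
}
```

With this shape, a function body containing two `co_await` expressions can reach the builder twice without aborting; the second call simply reports that nothing was built.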


Repository:
  rC Clang

https://reviews.llvm.org/D69022

Files:
  lib/Sema/SemaCoroutine.cpp
  test/SemaCXX/coroutines.cpp


Index: test/SemaCXX/coroutines.cpp
===
--- test/SemaCXX/coroutines.cpp
+++ test/SemaCXX/coroutines.cpp
@@ -87,6 +87,11 @@
   co_await a;
 }
 
+int no_promise_type_2(int a) { // expected-error {{this function cannot be a coroutine: 'std::experimental::coroutine_traits<int, int>' has no member named 'promise_type'}}
+  co_await a;
+  co_await a;
+}
+
 template <>
 struct std::experimental::coroutine_traits<double, double> { typedef int promise_type; };
 double bad_promise_type(double) { // expected-error {{this function cannot be a coroutine: 'experimental::coroutine_traits<double, double>::promise_type' (aka 'int') is not a class}}
Index: lib/Sema/SemaCoroutine.cpp
===
--- lib/Sema/SemaCoroutine.cpp
+++ lib/Sema/SemaCoroutine.cpp
@@ -1526,8 +1526,8 @@
   auto *FD = cast<FunctionDecl>(CurContext);
 
   auto *ScopeInfo = getCurFunction();
-  assert(ScopeInfo->CoroutineParameterMoves.empty() &&
- "Should not build parameter moves twice");
+  if (!ScopeInfo->CoroutineParameterMoves.empty())
+return false;
 
   for (auto *PD : FD->parameters()) {
 if (PD->getType()->isDependentType())
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D69022: [coroutines] Remove assert on CoroutineParameterMoves in Sema::buildCoroutineParameterMoves

2019-10-21 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

Gentle ping~ @modocache @GorNishanov


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69022/new/

https://reviews.llvm.org/D69022




[PATCH] D69022: [coroutines] Remove assert on CoroutineParameterMoves in Sema::buildCoroutineParameterMoves

2019-10-25 Thread JunMa via Phabricator via cfe-commits

junparser added a subscriber: rjmccall.
junparser added a comment.

In D69022#1720645, @rjmccall wrote:

> Despite generally knowing about coroutines and generally knowing about Clang, 
> I actually don't know the Clang coroutine code and can't review this properly.


Thanks anyway~


Repository:
  rC Clang

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69022/new/

https://reviews.llvm.org/D69022




[PATCH] D80979: [clang] Implement VectorType logic not operator.

2020-06-01 Thread JunMa via Phabricator via cfe-commits

junparser created this revision.
junparser added reviewers: erichkeane, aaron.ballman, rjmccall.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

As the title says, this patch implements the unary operator ! for vector types.

TestPlan: check-clang
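
The behavior described in the summary can be sketched as below. This is an illustration rather than one of the patch's tests: `logicalNot` and `lanesEqual` are hypothetical helper names, and it assumes a compiler that accepts `!` on GNU vectors in C++ (GCC, or Clang with this patch applied). Each lane that equals zero becomes -1 (all bits set) and every other lane becomes 0; the result has the signed integer vector type, the same as `V == 0`.

```cpp
#include <cassert>

typedef int int4 __attribute__((__vector_size__(16)));
typedef unsigned int uint4 __attribute__((__vector_size__(16)));

// Element-wise logical not: zero lanes become -1, non-zero lanes become 0.
// The result is the signed variant of the operand's vector type.
int4 logicalNot(uint4 V) {
  return !V;
}

// Compare two vectors lane by lane.
bool lanesEqual(int4 A, int4 B) {
  for (int I = 0; I < 4; ++I)
    if (A[I] != B[I])
      return false;
  return true;
}
```

For example, applying `logicalNot` to `{0, 1, 0, 7}` yields `{-1, 0, -1, 0}`.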


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D80979

Files:
  clang/docs/LanguageExtensions.rst
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/lib/Sema/SemaExpr.cpp
  clang/test/CodeGen/vector.c
  clang/test/Sema/vector-gcc-compat.cpp


Index: clang/test/Sema/vector-gcc-compat.cpp
===
--- clang/test/Sema/vector-gcc-compat.cpp
+++ clang/test/Sema/vector-gcc-compat.cpp
@@ -83,7 +83,7 @@
   v2i64 v2i64_c = (v2i64){3, 1}; // expected-warning {{compound literals are a C99-specific feature}}
   v2i64 v2i64_r;
 
-  v2i64_r = !v2i64_a;  // expected-error {{invalid argument type 'v2i64' (vector of 2 'long long' values) to unary expression}}
+  v2i64_r = !v2i64_a;
   v2i64_r = ~v2i64_a;
 
   v2i64_r = v2i64_a ? v2i64_b : v2i64_c;
Index: clang/test/CodeGen/vector.c
===
--- clang/test/CodeGen/vector.c
+++ clang/test/CodeGen/vector.c
@@ -80,3 +80,19 @@
 
 // CHECK: define void @lax_vector_compare2(<2 x i32>* {{.*sret.*}}, i64 {{.*}}, i64 {{.*}})
 // CHECK: icmp eq <2 x i32>
+
+vec_int1 lax_vector_logic_not1(int x, vec_int1 y) {
+  y = x != y;
+  return y;
+}
+
+// CHECK: define i32 @lax_vector_logic_not1(i32 {{.*}}, i32 {{.*}})
+// CHECK: icmp ne i32
+
+vec_int2 lax_vector_logic_not2(long long x, vec_int2 y) {
+  y = x != y;
+  return y;
+}
+
+// CHECK: define void @lax_vector_logic_not2(<2 x i32>* {{.*sret.*}}, i64 {{.*}}, i64 {{.*}})
+// CHECK: icmp ne <2 x i32>
Index: clang/lib/Sema/SemaExpr.cpp
===
--- clang/lib/Sema/SemaExpr.cpp
+++ clang/lib/Sema/SemaExpr.cpp
@@ -14436,12 +14436,19 @@
   return ExprError(Diag(OpLoc, diag::err_typecheck_unary_expr)
<< resultType << Input.get()->getSourceRange());
   }
+  // Vector logical not returns the signed variant of the operand type.
+  resultType = GetSignedVectorType(resultType);
+  break;
+} else if (Context.getLangOpts().CPlusPlus && resultType->isVectorType()) {
+  const VectorType *VTy = resultType->castAs<VectorType>();
+  if (VTy->getVectorKind() != VectorType::GenericVector)
+return ExprError(Diag(OpLoc, diag::err_typecheck_unary_expr)
+ << resultType << Input.get()->getSourceRange());
+
   // Vector logical not returns the signed variant of the operand type.
   resultType = GetSignedVectorType(resultType);
   break;
 } else {
-  // FIXME: GCC's vector extension permits the usage of '!' with a vector
-  //type in C++. We should allow that here too.
   return ExprError(Diag(OpLoc, diag::err_typecheck_unary_expr)
 << resultType << Input.get()->getSourceRange());
 }
Index: clang/lib/CodeGen/CGExprScalar.cpp
===
--- clang/lib/CodeGen/CGExprScalar.cpp
+++ clang/lib/CodeGen/CGExprScalar.cpp
@@ -2742,7 +2742,9 @@
 
 Value *ScalarExprEmitter::VisitUnaryLNot(const UnaryOperator *E) {
   // Perform vector logical not on comparison with zero vector.
-  if (E->getType()->isExtVectorType()) {
+  if (E->getType()->isVectorType() &&
+  E->getType()->castAs<VectorType>()->getVectorKind() ==
+  VectorType::GenericVector) {
 Value *Oper = Visit(E->getSubExpr());
 Value *Zero = llvm::Constant::getNullValue(Oper->getType());
 Value *Result;
Index: clang/docs/LanguageExtensions.rst
===
--- clang/docs/LanguageExtensions.rst
+++ clang/docs/LanguageExtensions.rst
@@ -475,7 +475,7 @@
 +,--,*,/,%                       yes     yes       yes         --
 bitwise operators &,|,^,~        yes     yes       yes         --
 >>,<<                            yes     yes       yes         --
-!, &&, ||                        yes     --        yes [#]_    --
+!, &&, ||                        yes     --        yes         --
 ==, !=, >, <, >=, <=             yes     yes       yes         --
 =                                yes     yes       yes         yes
 :? [#]_                          yes     --        yes         --
@@ -488,7 +488,6 @@
 
 See also :ref:`langext-__builtin_shufflevector`, :ref:`langext-__builtin_convertvector`.
 
-.. [#] unary operator ! is not implemented, however && and || are.
 .. [#] While OpenCL and GCC vectors both implement the comparison operator(?:) as a
   'select', they operate somewhat differently. OpenCL selects based on signedness of
   the condition operands, but GCC vectors use normal bool conversions (that is, != 0).



[PATCH] D80979: [clang] Implement VectorType logic not operator.

2020-06-03 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

@erichkeane @aaron.ballman, kindly ping :)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80979/new/

https://reviews.llvm.org/D80979




[PATCH] D80979: [clang] Implement VectorType logic not operator.

2020-06-03 Thread JunMa via Phabricator via cfe-commits

junparser marked 3 inline comments as done.
junparser added inline comments.



Comment at: clang/lib/CodeGen/CGExprScalar.cpp:2746
+  if (E->getType()->isVectorType() &&
+  E->getType()->castAs<VectorType>()->getVectorKind() ==
+  VectorType::GenericVector) {

erichkeane wrote:
> Why limit this to just the base vector type?  Doesn't this remove the 
> ext-vector implementation?
> 
> 
The vector kind of an ext-vector is GenericVector as well, so this check also 
covers ext-vectors.



Comment at: clang/lib/Sema/SemaExpr.cpp:14442
+  break;
+} else if (Context.getLangOpts().CPlusPlus && resultType->isVectorType()) {
+  const VectorType *VTy = resultType->castAs<VectorType>();

erichkeane wrote:
> Why C++ only?  It seems if we're doing this, it should be for all language 
> modes.
Here we keep the behavior the same as GCC, since GCC only allows ! on vectors 
in C++.



Comment at: clang/test/CodeGen/vector.c:90
+// CHECK: define i32 @lax_vector_logic_not1(i32 {{.*}}, i32 {{.*}})
+// CHECK: icmp ne i32
+

erichkeane wrote:
> Can you clarify what this is doing here?  It doesn't seem clear to me what 
> the output of this is.
> 
> Additionally, what about FP types?  What do we expect this to emit?
Sorry for the confusion. It seems I added the wrong code, which tests != rather 
than !. I'll add new test cases.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80979/new/

https://reviews.llvm.org/D80979




[PATCH] D80979: [clang] Implement VectorType logic not operator.

2020-06-03 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

This patch implements the logical-not operation for vector types. It keeps the same 
behavior as GCC (only allowed in C++). I'll update the wrong test cases.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80979/new/

https://reviews.llvm.org/D80979




[PATCH] D80979: [clang] Implement VectorType logic not operator.

2020-06-03 Thread JunMa via Phabricator via cfe-commits

junparser updated this revision to Diff 268366.
junparser added a comment.

Address the comments.
Hi @erichkeane, most of the functions in vector-1.cpp are copied from 
ext-vector.c with the vector type changed to the GCC vector type; they should emit the same 
IR. I added test7 and test8, which test the logical-not operation on GCC vector types.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80979/new/

https://reviews.llvm.org/D80979

Files:
  clang/docs/LanguageExtensions.rst
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/lib/Sema/SemaExpr.cpp
  clang/test/CodeGen/vector-1.cpp
  clang/test/Sema/vector-gcc-compat.cpp

Index: clang/test/Sema/vector-gcc-compat.cpp
===
--- clang/test/Sema/vector-gcc-compat.cpp
+++ clang/test/Sema/vector-gcc-compat.cpp
@@ -83,7 +83,7 @@
   v2i64 v2i64_c = (v2i64){3, 1}; // expected-warning {{compound literals are a C99-specific feature}}
   v2i64 v2i64_r;
 
-  v2i64_r = !v2i64_a;  // expected-error {{invalid argument type 'v2i64' (vector of 2 'long long' values) to unary expression}}
+  v2i64_r = !v2i64_a;
   v2i64_r = ~v2i64_a;
 
   v2i64_r = v2i64_a ? v2i64_b : v2i64_c;
Index: clang/test/CodeGen/vector-1.cpp
===
--- /dev/null
+++ clang/test/CodeGen/vector-1.cpp
@@ -0,0 +1,216 @@
+// RUN: %clang_cc1 -emit-llvm %s -o - | FileCheck %s
+
+typedef __attribute__((__vector_size__(16))) float float4;
+typedef __attribute__((__vector_size__(16))) int int4;
+typedef __attribute__((__vector_size__(16))) unsigned int uint4;
+
+// CHECK: @_Z4testPDv4_f
+// CHECK: store <4 x float> 
+void test(float4 *out) {
+  *out = ((float4){1.0f, 2.0f, 3.0f, 4.0f});
+}
+
+// CHECK: @_Z5test1PDv4_f
+// CHECK: store <4 x float>
+// CHECK: store <4 x float>
+void test1(float4 *out) {
+  float a = 1.0f;
+  float b = 2.0f;
+  float c = 3.0f;
+  float d = 4.0f;
+  *out = ((float4){a, b, c, d});
+}
+
+// CHECK: @_Z5test3PDv4_fS0_f
+void test3(float4 *ap, float4 *bp, float c) {
+  float4 a = *ap;
+  float4 b = *bp;
+
+  // CHECK: fadd <4 x float>
+  // CHECK: fsub <4 x float>
+  // CHECK: fmul <4 x float>
+  // CHECK: fdiv <4 x float>
+  a = a + b;
+  a = a - b;
+  a = a * b;
+  a = a / b;
+
+  // CHECK: fadd <4 x float>
+  // CHECK: fsub <4 x float>
+  // CHECK: fmul <4 x float>
+  // CHECK: fdiv <4 x float>
+  a = a + c;
+  a = a - c;
+  a = a * c;
+  a = a / c;
+
+  // CHECK: fadd <4 x float>
+  // CHECK: fsub <4 x float>
+  // CHECK: fmul <4 x float>
+  // CHECK: fdiv <4 x float>
+  a += b;
+  a -= b;
+  a *= b;
+  a /= b;
+
+  // CHECK: fadd <4 x float>
+  // CHECK: fsub <4 x float>
+  // CHECK: fmul <4 x float>
+  // CHECK: fdiv <4 x float>
+  a += c;
+  a -= c;
+  a *= c;
+  a /= c;
+}
+
+// CHECK: @_Z5test4PDv4_iS0_i
+void test4(int4 *ap, int4 *bp, int c) {
+  int4 a = *ap;
+  int4 b = *bp;
+
+  // CHECK: add <4 x i32>
+  // CHECK: sub <4 x i32>
+  // CHECK: mul <4 x i32>
+  // CHECK: sdiv <4 x i32>
+  // CHECK: srem <4 x i32>
+  a = a + b;
+  a = a - b;
+  a = a * b;
+  a = a / b;
+  a = a % b;
+
+  // CHECK: add <4 x i32>
+  // CHECK: sub <4 x i32>
+  // CHECK: mul <4 x i32>
+  // CHECK: sdiv <4 x i32>
+  // CHECK: srem <4 x i32>
+  a = a + c;
+  a = a - c;
+  a = a * c;
+  a = a / c;
+  a = a % c;
+
+  // CHECK: add <4 x i32>
+  // CHECK: sub <4 x i32>
+  // CHECK: mul <4 x i32>
+  // CHECK: sdiv <4 x i32>
+  // CHECK: srem <4 x i32>
+  a += b;
+  a -= b;
+  a *= b;
+  a /= b;
+  a %= b;
+
+  // CHECK: add <4 x i32>
+  // CHECK: sub <4 x i32>
+  // CHECK: mul <4 x i32>
+  // CHECK: sdiv <4 x i32>
+  // CHECK: srem <4 x i32>
+  a += c;
+  a -= c;
+  a *= c;
+  a /= c;
+  a %= c;
+
+  // Vector comparisons.
+  // CHECK: icmp slt
+  // CHECK: icmp sle
+  // CHECK: icmp sgt
+  // CHECK: icmp sge
+  // CHECK: icmp eq
+  // CHECK: icmp ne
+  int4 cmp;
+  cmp = a < b;
+  cmp = a <= b;
+  cmp = a > b;
+  cmp = a >= b;
+  cmp = a == b;
+  cmp = a != b;
+}
+
+// CHECK: @_Z5test5PDv4_fS0_i
+void test5(float4 *ap, float4 *bp, int c) {
+  float4 a = *ap;
+  float4 b = *bp;
+
+  // Vector comparisons.
+  // CHECK: fcmp olt
+  // CHECK: fcmp ole
+  // CHECK: fcmp ogt
+  // CHECK: fcmp oge
+  // CHECK: fcmp oeq
+  // CHECK: fcmp une
+  int4 cmp;
+  cmp = a < b;
+  cmp = a <= b;
+  cmp = a > b;
+  cmp = a >= b;
+  cmp = a == b;
+  cmp = a != b;
+}
+
+// CHECK: @_Z5test6PDv4_jS0_j
+void test6(uint4 *ap, uint4 *bp, unsigned c) {
+  uint4 a = *ap;
+  uint4 b = *bp;
+  int4 d;
+
+  // CHECK: udiv <4 x i32>
+  // CHECK: urem <4 x i32>
+  a = a / b;
+  a = a % b;
+
+  // CHECK: udiv <4 x i32>
+  // CHECK: urem <4 x i32>
+  a = a / c;
+  a = a % c;
+
+  // CHECK: icmp ult
+  // CHECK: icmp ule
+  // CHECK: icmp ugt
+  // CHECK: icmp uge
+  // CHECK: icmp eq
+  // CHECK: icmp ne
+  d = a < b;
+  d = a <= b;
+  d = a > b;
+  d = a >= b;
+  d = a == b;
+  d = a != b;
+}
+
+// CHECK: @_Z5test7Dv4_j
+int4 test7(uint4 V0) {
+  // CHECK: [[CMP0:%.*]] = icmp eq <4 x i32> [[V0:%.*]], zeroinitializer
+  // CHECK-NEXT: [[V

[PATCH] D80979: [clang] Implement VectorType logic not operator.

2020-06-03 Thread JunMa via Phabricator via cfe-commits

junparser marked 2 inline comments as done.
junparser added inline comments.



Comment at: clang/test/CodeGen/vector-1.cpp:183
+// CHECK: @_Z5test7Dv4_j
+int4 test7(uint4 V0) {
+  // CHECK: [[CMP0:%.*]] = icmp eq <4 x i32> [[V0:%.*]], zeroinitializer

logic operation with vector int



Comment at: clang/test/CodeGen/vector-1.cpp:201
+// CHECK: @_Z5test8Dv4_fS_
+int4 test8(float4 V0, float4 V1) {
+  // CHECK: [[CMP0:%.*]] = fcmp oeq <4 x float> [[V0:%.*]], zeroinitializer

logic operation with vector float


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80979/new/

https://reviews.llvm.org/D80979




[PATCH] D80979: [clang] Implement VectorType logic not operator.

2020-06-04 Thread JunMa via Phabricator via cfe-commits

junparser marked 2 inline comments as done.
junparser added inline comments.



Comment at: clang/test/CodeGen/vector-1.cpp:2
+// RUN: %clang_cc1 -emit-llvm %s -o - | FileCheck %s
+
+typedef __attribute__((__vector_size__(16))) float float4;

erichkeane wrote:
> I don't think copying the whole test from the other file is the right idea. 
> We already validate the rest of the operations on normal vectors in a number 
> of places.  If any of those are C++ tests, just add your tests there.  
> Otherwise this test should only validate the logical-not operator.
ok, I'll only add the logical-not operator test


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80979/new/

https://reviews.llvm.org/D80979




[PATCH] D80979: [clang] Implement VectorType logic not operator.

2020-06-04 Thread JunMa via Phabricator via cfe-commits

junparser updated this revision to Diff 268649.
junparser added a comment.

address the comment.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80979/new/

https://reviews.llvm.org/D80979

Files:
  clang/docs/LanguageExtensions.rst
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/lib/Sema/SemaExpr.cpp
  clang/test/CodeGen/vector-logic-not.cpp
  clang/test/Sema/vector-gcc-compat.cpp

Index: clang/test/Sema/vector-gcc-compat.cpp
===
--- clang/test/Sema/vector-gcc-compat.cpp
+++ clang/test/Sema/vector-gcc-compat.cpp
@@ -83,7 +83,7 @@
   v2i64 v2i64_c = (v2i64){3, 1}; // expected-warning {{compound literals are a C99-specific feature}}
   v2i64 v2i64_r;
 
-  v2i64_r = !v2i64_a;  // expected-error {{invalid argument type 'v2i64' (vector of 2 'long long' values) to unary expression}}
+  v2i64_r = !v2i64_a;
   v2i64_r = ~v2i64_a;
 
   v2i64_r = v2i64_a ? v2i64_b : v2i64_c;
Index: clang/test/CodeGen/vector-logic-not.cpp
===
--- /dev/null
+++ clang/test/CodeGen/vector-logic-not.cpp
@@ -0,0 +1,21 @@
+// RUN: %clang_cc1 -emit-llvm %s -o - | FileCheck %s
+
+typedef __attribute__((__vector_size__(16))) float float4;
+typedef __attribute__((__vector_size__(16))) int int4;
+typedef __attribute__((__vector_size__(16))) unsigned int uint4;
+
+// CHECK: @_Z5test1Dv4_j
+int4 test1(uint4 V0) {
+  // CHECK: [[CMP0:%.*]] = icmp eq <4 x i32> [[V0:%.*]], zeroinitializer
+  // CHECK-NEXT: [[V1:%.*]] = sext <4 x i1> [[CMP0]] to <4 x i32>
+  int4 V = !V0;
+  return V;
+}
+
+// CHECK: @_Z5test2Dv4_fS_
+int4 test2(float4 V0, float4 V1) {
+  // CHECK: [[CMP0:%.*]] = fcmp oeq <4 x float> [[V0:%.*]], zeroinitializer
+  // CHECK-NEXT: [[V1:%.*]] = sext <4 x i1> [[CMP0]] to <4 x i32>
+  int4 V = !V0;
+  return V;
+}
Index: clang/lib/Sema/SemaExpr.cpp
===
--- clang/lib/Sema/SemaExpr.cpp
+++ clang/lib/Sema/SemaExpr.cpp
@@ -14442,12 +14442,19 @@
   return ExprError(Diag(OpLoc, diag::err_typecheck_unary_expr)
<< resultType << Input.get()->getSourceRange());
   }
+  // Vector logical not returns the signed variant of the operand type.
+  resultType = GetSignedVectorType(resultType);
+  break;
+} else if (Context.getLangOpts().CPlusPlus && resultType->isVectorType()) {
+  const VectorType *VTy = resultType->castAs<VectorType>();
+  if (VTy->getVectorKind() != VectorType::GenericVector)
+return ExprError(Diag(OpLoc, diag::err_typecheck_unary_expr)
+ << resultType << Input.get()->getSourceRange());
+
   // Vector logical not returns the signed variant of the operand type.
   resultType = GetSignedVectorType(resultType);
   break;
 } else {
-  // FIXME: GCC's vector extension permits the usage of '!' with a vector
-  //type in C++. We should allow that here too.
   return ExprError(Diag(OpLoc, diag::err_typecheck_unary_expr)
 << resultType << Input.get()->getSourceRange());
 }
Index: clang/lib/CodeGen/CGExprScalar.cpp
===
--- clang/lib/CodeGen/CGExprScalar.cpp
+++ clang/lib/CodeGen/CGExprScalar.cpp
@@ -2742,7 +2742,9 @@
 
 Value *ScalarExprEmitter::VisitUnaryLNot(const UnaryOperator *E) {
   // Perform vector logical not on comparison with zero vector.
-  if (E->getType()->isExtVectorType()) {
+  if (E->getType()->isVectorType() &&
+  E->getType()->castAs<VectorType>()->getVectorKind() ==
+  VectorType::GenericVector) {
 Value *Oper = Visit(E->getSubExpr());
 Value *Zero = llvm::Constant::getNullValue(Oper->getType());
 Value *Result;
Index: clang/docs/LanguageExtensions.rst
===
--- clang/docs/LanguageExtensions.rst
+++ clang/docs/LanguageExtensions.rst
@@ -475,7 +475,7 @@
 +,--,*,/,%                       yes     yes       yes         --
 bitwise operators &,|,^,~        yes     yes       yes         --
 >>,<<                            yes     yes       yes         --
-!, &&, ||                        yes     --        yes [#]_    --
+!, &&, ||                        yes     --        yes         --
 ==, !=, >, <, >=, <=             yes     yes       yes         --
 =                                yes     yes       yes         yes
 ?: [#]_                          yes     --        yes         --
@@ -488,7 +488,6 @@
 
 See also :ref:`langext-__builtin_shufflevector`, :ref:`langext-__builtin_convertvector`.
 
-.. [#] unary operator ! is not implemented, however && and || are.
 .. [#] ternary operator(?:) has different behaviors depending on condition
   operand's vector type. If the condition is a GNU vector (i.e. __vector_size__),
   it's only available in C++ and uses normal bool conversions (that is, != 0).

[PATCH] D80979: [clang] Implement VectorType logic not operator.

2020-06-07 Thread JunMa via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rGa0de3335edcf: [clang] Implement VectorType logic not 
operator. (authored by junparser).

Changed prior to commit:
  https://reviews.llvm.org/D80979?vs=268649&id=269084#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80979/new/

https://reviews.llvm.org/D80979

Files:
  clang/docs/LanguageExtensions.rst
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/lib/Sema/SemaExpr.cpp
  clang/test/CodeGen/vector-logic-not.cpp
  clang/test/Sema/vector-gcc-compat.cpp


Index: clang/test/Sema/vector-gcc-compat.cpp
===
--- clang/test/Sema/vector-gcc-compat.cpp
+++ clang/test/Sema/vector-gcc-compat.cpp
@@ -83,7 +83,7 @@
   v2i64 v2i64_c = (v2i64){3, 1}; // expected-warning {{compound literals are a C99-specific feature}}
   v2i64 v2i64_r;
 
-  v2i64_r = !v2i64_a;  // expected-error {{invalid argument type 'v2i64' (vector of 2 'long long' values) to unary expression}}
+  v2i64_r = !v2i64_a;
   v2i64_r = ~v2i64_a;
 
   v2i64_r = v2i64_a ? v2i64_b : v2i64_c;
Index: clang/test/CodeGen/vector-logic-not.cpp
===
--- /dev/null
+++ clang/test/CodeGen/vector-logic-not.cpp
@@ -0,0 +1,21 @@
+// RUN: %clang_cc1 -emit-llvm %s -o - | FileCheck %s
+
+typedef __attribute__((__vector_size__(16))) float float4;
+typedef __attribute__((__vector_size__(16))) int int4;
+typedef __attribute__((__vector_size__(16))) unsigned int uint4;
+
+// CHECK: @_Z5test1Dv4_j
+int4 test1(uint4 V0) {
+  // CHECK: [[CMP0:%.*]] = icmp eq <4 x i32> [[V0:%.*]], zeroinitializer
+  // CHECK-NEXT: [[V1:%.*]] = sext <4 x i1> [[CMP0]] to <4 x i32>
+  int4 V = !V0;
+  return V;
+}
+
+// CHECK: @_Z5test2Dv4_fS_
+int4 test2(float4 V0, float4 V1) {
+  // CHECK: [[CMP0:%.*]] = fcmp oeq <4 x float> [[V0:%.*]], zeroinitializer
+  // CHECK-NEXT: [[V1:%.*]] = sext <4 x i1> [[CMP0]] to <4 x i32>
+  int4 V = !V0;
+  return V;
+}
Index: clang/lib/Sema/SemaExpr.cpp
===
--- clang/lib/Sema/SemaExpr.cpp
+++ clang/lib/Sema/SemaExpr.cpp
@@ -14484,9 +14484,16 @@
   // Vector logical not returns the signed variant of the operand type.
   resultType = GetSignedVectorType(resultType);
   break;
+} else if (Context.getLangOpts().CPlusPlus && resultType->isVectorType()) {
+  const VectorType *VTy = resultType->castAs<VectorType>();
+  if (VTy->getVectorKind() != VectorType::GenericVector)
+return ExprError(Diag(OpLoc, diag::err_typecheck_unary_expr)
+ << resultType << Input.get()->getSourceRange());
+
+  // Vector logical not returns the signed variant of the operand type.
+  resultType = GetSignedVectorType(resultType);
+  break;
 } else {
-  // FIXME: GCC's vector extension permits the usage of '!' with a vector
-  //type in C++. We should allow that here too.
   return ExprError(Diag(OpLoc, diag::err_typecheck_unary_expr)
 << resultType << Input.get()->getSourceRange());
 }
Index: clang/lib/CodeGen/CGExprScalar.cpp
===
--- clang/lib/CodeGen/CGExprScalar.cpp
+++ clang/lib/CodeGen/CGExprScalar.cpp
@@ -2762,7 +2762,9 @@
 
 Value *ScalarExprEmitter::VisitUnaryLNot(const UnaryOperator *E) {
   // Perform vector logical not on comparison with zero vector.
-  if (E->getType()->isExtVectorType()) {
+  if (E->getType()->isVectorType() &&
+  E->getType()->castAs<VectorType>()->getVectorKind() ==
+  VectorType::GenericVector) {
 Value *Oper = Visit(E->getSubExpr());
 Value *Zero = llvm::Constant::getNullValue(Oper->getType());
 Value *Result;
Index: clang/docs/LanguageExtensions.rst
===
--- clang/docs/LanguageExtensions.rst
+++ clang/docs/LanguageExtensions.rst
@@ -475,7 +475,7 @@
 +,--,*,/,%                       yes     yes       yes         --
 bitwise operators &,|,^,~        yes     yes       yes         --
 >>,<<                            yes     yes       yes         --
-!, &&, ||                        yes     --        yes [#]_    --
+!, &&, ||                        yes     --        yes         --
 ==, !=, >, <, >=, <=             yes     yes       yes         --
 =                                yes     yes       yes         yes
 ?: [#]_                          yes     --        yes         --
@@ -488,7 +488,6 @@
 
 See also :ref:`langext-__builtin_shufflevector`, :ref:`langext-__builtin_convertvector`.
 
-.. [#] unary operator ! is not implemented, however && and || are.
 .. [#] ternary operator(?:) has different behaviors depending on condition
   operand's vector type. If the condition is a GNU vector (i.e. __vector_size__),
   it's only available in C++ and uses normal bool conversions (that is, != 0).



[PATCH] D81543: [CodeGen][TLS] Set TLS Model for __tls_guard as well.

2020-06-10 Thread JunMa via Phabricator via cfe-commits

junparser created this revision.
junparser added reviewers: chh, rnk, aaron.ballman, rjmccall.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.
junparser edited the summary of this revision.

Currently we do not set the TLS model for __tls_guard, whether or not the 
option -ftls-model is given, which is suboptimal.  
This patch sets the TLS model from the command line for __tls_guard as well, 
which matches GCC's behavior.

TestPlan: check-clang
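
To make the expected IR in the updated tests easier to read, here is a small stand-alone sketch of how a -ftls-model= value corresponds to the thread_local spelling in LLVM IR. It does not use the Clang or LLVM APIs: `TLSModelSketch`, `parseTLSModel`, `irSpelling`, and `guardDecl` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum for the four -ftls-model= values; not Clang's own type.
enum class TLSModelSketch { GeneralDynamic, LocalDynamic, InitialExec, LocalExec };

// Map a -ftls-model= value to the sketch enum; returns false if unknown.
bool parseTLSModel(const std::string &Name, TLSModelSketch &Out) {
  if (Name == "global-dynamic") { Out = TLSModelSketch::GeneralDynamic; return true; }
  if (Name == "local-dynamic")  { Out = TLSModelSketch::LocalDynamic;   return true; }
  if (Name == "initial-exec")   { Out = TLSModelSketch::InitialExec;    return true; }
  if (Name == "local-exec")     { Out = TLSModelSketch::LocalExec;      return true; }
  return false;
}

// LLVM IR spelling of the thread_local attribute for each model.
// General dynamic is the default and is printed without a parenthesized model.
std::string irSpelling(TLSModelSketch M) {
  switch (M) {
  case TLSModelSketch::GeneralDynamic: return "thread_local";
  case TLSModelSketch::LocalDynamic:   return "thread_local(localdynamic)";
  case TLSModelSketch::InitialExec:    return "thread_local(initialexec)";
  case TLSModelSketch::LocalExec:      return "thread_local(localexec)";
  }
  return "thread_local";
}

// The guard declaration the updated tests expect for a given default model.
std::string guardDecl(TLSModelSketch M) {
  return "@__tls_guard = internal " + irSpelling(M) + " global i8 0";
}
```

With the patch applied, the guard variable is expected to carry the same model as the other thread-local variables in the translation unit, which is what the CHECK-GD, CHECK-LD, CHECK-IE, and CHECK-LE lines in tls-model.cpp verify.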


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D81543

Files:
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/CodeGen/ItaniumCXXABI.cpp
  clang/test/CodeGen/tls-model.c
  clang/test/CodeGen/tls-model.cpp

Index: clang/test/CodeGen/tls-model.cpp
===
--- clang/test/CodeGen/tls-model.cpp
+++ clang/test/CodeGen/tls-model.cpp
@@ -16,29 +16,52 @@
 }
 int __thread __attribute__((tls_model("initial-exec"))) z;
 
+struct S {
+  S();
+  ~S();
+};
+struct T {
+  ~T();
+};
+
+struct S thread_local s1;
+struct T thread_local t1;
+
 // Note that unlike normal C uninitialized global variables,
 // uninitialized TLS variables do NOT have COMMON linkage.
 
 // CHECK-GD: @z1 = global i32 0
-// CHECK-GD: @f.y = internal thread_local global i32 0
 // CHECK-GD: @z2 = global i32 0
 // CHECK-GD: @x = thread_local global i32 0
+// CHECK-GD: @_ZZ1fvE1y = internal thread_local global i32 0
 // CHECK-GD: @z = thread_local(initialexec) global i32 0
+// CHECK-GD: @s1 = thread_local global %struct.S zeroinitializer
+// CHECK-GD: @t1 = thread_local global %struct.T zeroinitializer
+// CHECK-GD: @__tls_guard = internal thread_local global i8 0
 
 // CHECK-LD: @z1 = global i32 0
-// CHECK-LD: @f.y = internal thread_local(localdynamic) global i32 0
 // CHECK-LD: @z2 = global i32 0
 // CHECK-LD: @x = thread_local(localdynamic) global i32 0
+// CHECK-LD: @_ZZ1fvE1y = internal thread_local(localdynamic) global i32 0
 // CHECK-LD: @z = thread_local(initialexec) global i32 0
+// CHECK-LD: @s1 = thread_local(localdynamic) global %struct.S zeroinitializer
+// CHECK-LD: @t1 = thread_local(localdynamic) global %struct.T zeroinitializer
+// CHECK-LD: @__tls_guard = internal thread_local(localdynamic) global i8 0
 
 // CHECK-IE: @z1 = global i32 0
-// CHECK-IE: @f.y = internal thread_local(initialexec) global i32 0
 // CHECK-IE: @z2 = global i32 0
 // CHECK-IE: @x = thread_local(initialexec) global i32 0
+// CHECK-IE: @_ZZ1fvE1y = internal thread_local(initialexec) global i32 0
 // CHECK-IE: @z = thread_local(initialexec) global i32 0
+// CHECK-IE: @s1 = thread_local(initialexec) global %struct.S zeroinitializer
+// CHECK-IE: @t1 = thread_local(initialexec) global %struct.T zeroinitializer
+// CHECK-IE: @__tls_guard = internal thread_local(initialexec) global i8 0
 
 // CHECK-LE: @z1 = global i32 0
-// CHECK-LE: @f.y = internal thread_local(localexec) global i32 0
 // CHECK-LE: @z2 = global i32 0
 // CHECK-LE: @x = thread_local(localexec) global i32 0
+// CHECK-LE: @_ZZ1fvE1y = internal thread_local(localexec) global i32 0
 // CHECK-LE: @z = thread_local(initialexec) global i32 0
+// CHECK-LE: @s1 = thread_local(localexec) global %struct.S zeroinitializer
+// CHECK-LE: @t1 = thread_local(localexec) global %struct.T zeroinitializer
+// CHECK-LE: @__tls_guard = internal thread_local(localexec) global i8 0
Index: clang/lib/CodeGen/ItaniumCXXABI.cpp
===
--- clang/lib/CodeGen/ItaniumCXXABI.cpp
+++ clang/lib/CodeGen/ItaniumCXXABI.cpp
@@ -2619,6 +2619,7 @@
 llvm::GlobalVariable::InternalLinkage,
 llvm::ConstantInt::get(CGM.Int8Ty, 0), "__tls_guard");
 Guard->setThreadLocal(true);
+Guard->setThreadLocalMode(CGM.GetDefaultLLVMTLSModel());
 
 CharUnits GuardAlign = CharUnits::One();
 Guard->setAlignment(GuardAlign.getAsAlign());
Index: clang/lib/CodeGen/CodeGenModule.h
===
--- clang/lib/CodeGen/CodeGenModule.h
+++ clang/lib/CodeGen/CodeGenModule.h
@@ -790,6 +790,9 @@
   /// variable declaration D.
   void setTLSMode(llvm::GlobalValue *GV, const VarDecl &D) const;
 
+  /// Get LLVM TLS mode from CodeGenOptions.
+  llvm::GlobalVariable::ThreadLocalMode GetDefaultLLVMTLSModel() const;
+
   static llvm::GlobalValue::VisibilityTypes GetLLVMVisibility(Visibility V) {
 switch (V) {
 case DefaultVisibility:   return llvm::GlobalValue::DefaultVisibility;
Index: clang/lib/CodeGen/CodeGenModule.cpp
===
--- clang/lib/CodeGen/CodeGenModule.cpp
+++ clang/lib/CodeGen/CodeGenModule.cpp
@@ -971,9 +971,9 @@
   .Case("local-exec", llvm::GlobalVariable::LocalExecTLSModel);
 }
 
-static llvm::GlobalVariable::ThreadLocalMode GetLLVMTLSModel(
-CodeGenOptions::TLSModel M) {
-  switch (M) {
+llvm::GlobalVariable::ThreadLocalMode
+CodeGenModule::GetDefaultLLVMTLSModel() const {
+  switch (CodeGenOpts.ge

[PATCH] D81543: [CodeGen][TLS] Set TLS Model for __tls_guard as well.

2020-06-11 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

kindly ping~


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81543/new/

https://reviews.llvm.org/D81543




[PATCH] D81543: [CodeGen][TLS] Set TLS Model for __tls_guard as well.

2020-06-14 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

@rnk @aaron.ballman any comments?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81543/new/

https://reviews.llvm.org/D81543




[PATCH] D81543: [CodeGen][TLS] Set TLS Model for __tls_guard as well.

2020-06-15 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

@aaron.ballman thanks for the review. I have updated the patch.




Comment at: clang/lib/CodeGen/ItaniumCXXABI.cpp:2622
 Guard->setThreadLocal(true);
+Guard->setThreadLocalMode(CGM.GetDefaultLLVMTLSModel());
 

aaron.ballman wrote:
> Do we need a similar change in `MicrosoftCXXABI::EmitGuardedInit()`? I notice 
> it sets thread local to true but does not set the thread local mode.
Yes, it should be changed as well.
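
For readers unfamiliar with the guard being discussed, here is a minimal hand-written sketch of the pattern, not the code Clang actually emits. The names `tls_guard`, `tls_payload`, `tls_init`, `payload` and `init_runs` are hypothetical stand-ins chosen for illustration. The point is that the guard is itself a thread-local variable, which is why it needs a TLS model just like the objects it protects.

```cpp
#include <cassert>

// Hypothetical stand-ins for compiler-generated entities (illustration only).
static int g_init_runs = 0;           // how often the initializer body ran

thread_local bool tls_guard = false;  // plays the role of __tls_guard
thread_local int tls_payload = 0;     // plays the role of a thread_local object

// Plays the role of __tls_init: run the dynamic initializers once per thread.
void tls_init() {
  if (tls_guard)
    return;                           // already initialized in this thread
  tls_guard = true;
  tls_payload = 42;
  ++g_init_runs;
}

// Plays the role of the thread wrapper: initialize on first use, then hand
// back a reference to this thread's copy.
int &payload() {
  tls_init();
  assert(tls_guard);                  // the guard is always set after tls_init
  return tls_payload;
}

// Exposes the counter so callers can observe that initialization ran once.
int init_runs() { return g_init_runs; }
```

Every access goes through `payload()`, so the guard is read on each use. That is the reason it matters that the guard gets the same TLS model as the variables it protects.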


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81543/new/

https://reviews.llvm.org/D81543




[PATCH] D81543: [CodeGen][TLS] Set TLS Model for __tls_guard as well.

2020-06-15 Thread JunMa via Phabricator via cfe-commits

junparser updated this revision to Diff 270969.
junparser marked an inline comment as done.
junparser added a comment.

Addressed the review comments.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81543/new/

https://reviews.llvm.org/D81543

Files:
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/CodeGen/ItaniumCXXABI.cpp
  clang/lib/CodeGen/MicrosoftCXXABI.cpp
  clang/test/CodeGen/tls-model.c
  clang/test/CodeGen/tls-model.cpp
  clang/test/CodeGenCXX/ms-thread_local.cpp

Index: clang/test/CodeGenCXX/ms-thread_local.cpp
===
--- clang/test/CodeGenCXX/ms-thread_local.cpp
+++ clang/test/CodeGenCXX/ms-thread_local.cpp
@@ -1,4 +1,5 @@
 // RUN: %clang_cc1 %s -std=c++1y -triple=i686-pc-win32 -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 %s  -std=c++1y -triple=i686-pc-win32 -ftls-model=local-dynamic -emit-llvm -o - | FileCheck %s -check-prefix=CHECK-LD
 
 struct A {
   A();
@@ -8,15 +9,22 @@
 // CHECK-DAG: $"??$a@X@@3UA@@A" = comdat any
 // CHECK-DAG: @"??$a@X@@3UA@@A" = linkonce_odr dso_local thread_local global %struct.A zeroinitializer, comdat, align 1
 // CHECK-DAG: @"??__E?$a@X@@YAXXZ$initializer$" = internal constant void ()* @"??__E?$a@X@@YAXXZ", section ".CRT$XDU", comdat($"??$a@X@@3UA@@A")
+// CHECK-LD-DAG: $"??$a@X@@3UA@@A" = comdat any
+// CHECK-LD-DAG: @"??$a@X@@3UA@@A" = linkonce_odr dso_local thread_local(localdynamic) global %struct.A zeroinitializer, comdat, align 1
+// CHECK-LD-DAG: @"??__E?$a@X@@YAXXZ$initializer$" = internal constant void ()* @"??__E?$a@X@@YAXXZ", section ".CRT$XDU", comdat($"??$a@X@@3UA@@A")
 template 
 thread_local A a = A();
 
 // CHECK-DAG: @"?b@@3UA@@A" = dso_local thread_local global %struct.A zeroinitializer, align 1
 // CHECK-DAG: @"__tls_init$initializer$" = internal constant void ()* @__tls_init, section ".CRT$XDU"
+// CHECK-LD-DAG: @"?b@@3UA@@A" = dso_local thread_local(localdynamic) global %struct.A zeroinitializer, align 1
+// CHECK-LD-DAG: @"__tls_init$initializer$" = internal constant void ()* @__tls_init, section ".CRT$XDU"
 thread_local A b;
 
 // CHECK-LABEL: define internal void @__tls_init()
 // CHECK: call void @"??__Eb@@YAXXZ"
+// CHECK-LD-LABEL: define internal void @__tls_init()
+// CHECK-LD: call void @"??__Eb@@YAXXZ"
 
 thread_local A &c = b;
 thread_local A &d = c;
@@ -29,3 +37,5 @@
 
 // CHECK: !llvm.linker.options = !{![[dyn_tls_init:[0-9]+]]}
 // CHECK: ![[dyn_tls_init]] = !{!"/include:___dyn_tls_init@12"}
+// CHECK-LD: !llvm.linker.options = !{![[dyn_tls_init:[0-9]+]]}
+// CHECK-LD: ![[dyn_tls_init]] = !{!"/include:___dyn_tls_init@12"}
Index: clang/test/CodeGen/tls-model.cpp
===
--- clang/test/CodeGen/tls-model.cpp
+++ clang/test/CodeGen/tls-model.cpp
@@ -16,29 +16,52 @@
 }
 int __thread __attribute__((tls_model("initial-exec"))) z;
 
+struct S {
+  S();
+  ~S();
+};
+struct T {
+  ~T();
+};
+
+struct S thread_local s1;
+struct T thread_local t1;
+
 // Note that unlike normal C uninitialized global variables,
 // uninitialized TLS variables do NOT have COMMON linkage.
 
 // CHECK-GD: @z1 = global i32 0
-// CHECK-GD: @f.y = internal thread_local global i32 0
 // CHECK-GD: @z2 = global i32 0
 // CHECK-GD: @x = thread_local global i32 0
+// CHECK-GD: @_ZZ1fvE1y = internal thread_local global i32 0
 // CHECK-GD: @z = thread_local(initialexec) global i32 0
+// CHECK-GD: @s1 = thread_local global %struct.S zeroinitializer
+// CHECK-GD: @t1 = thread_local global %struct.T zeroinitializer
+// CHECK-GD: @__tls_guard = internal thread_local global i8 0
 
 // CHECK-LD: @z1 = global i32 0
-// CHECK-LD: @f.y = internal thread_local(localdynamic) global i32 0
 // CHECK-LD: @z2 = global i32 0
 // CHECK-LD: @x = thread_local(localdynamic) global i32 0
+// CHECK-LD: @_ZZ1fvE1y = internal thread_local(localdynamic) global i32 0
 // CHECK-LD: @z = thread_local(initialexec) global i32 0
+// CHECK-LD: @s1 = thread_local(localdynamic) global %struct.S zeroinitializer
+// CHECK-LD: @t1 = thread_local(localdynamic) global %struct.T zeroinitializer
+// CHECK-LD: @__tls_guard = internal thread_local(localdynamic) global i8 0
 
 // CHECK-IE: @z1 = global i32 0
-// CHECK-IE: @f.y = internal thread_local(initialexec) global i32 0
 // CHECK-IE: @z2 = global i32 0
 // CHECK-IE: @x = thread_local(initialexec) global i32 0
+// CHECK-IE: @_ZZ1fvE1y = internal thread_local(initialexec) global i32 0
 // CHECK-IE: @z = thread_local(initialexec) global i32 0
+// CHECK-IE: @s1 = thread_local(initialexec) global %struct.S zeroinitializer
+// CHECK-IE: @t1 = thread_local(initialexec) global %struct.T zeroinitializer
+// CHECK-IE: @__tls_guard = internal thread_local(initialexec) global i8 0
 
 // CHECK-LE: @z1 = global i32 0
-// CHECK-LE: @f.y = internal thread_local(localexec) global i32 0
 // CHECK-LE: @z2 = global i32 0
 // CHECK-LE: @x = thread_local(localexec) global i32 0
+// CHECK-LE: @_ZZ1fvE1y = internal thread_

[PATCH] D81543: [CodeGen][TLS] Set TLS Model for __tls_guard as well.

2020-06-16 Thread JunMa via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rG4a1776979fd8: [CodeGen][TLS] Set TLS Model for __tls_guard 
as well. (authored by junparser).
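
As a reading aid for the FileCheck lines in tls-model.cpp, the sketch below maps each `-ftls-model=` value to the `thread_local` spelling those lines expect in the emitted IR. The function name `ir_tls_spelling` is hypothetical, and the sketch assumes the `CHECK-GD` prefix corresponds to `global-dynamic`, which the IR prints without a suffix.

```cpp
#include <cassert>
#include <string>

// Returns the IR `thread_local` spelling expected for a given -ftls-model=
// value, or an empty string for an unrecognized value.
// (Hypothetical helper for illustration; not part of Clang.)
std::string ir_tls_spelling(const std::string &model) {
  if (model == "global-dynamic")
    return "thread_local";                 // default model, printed bare
  if (model == "local-dynamic")
    return "thread_local(localdynamic)";
  if (model == "initial-exec")
    return "thread_local(initialexec)";
  if (model == "local-exec")
    return "thread_local(localexec)";
  return "";
}
```

With this patch the same spelling is expected on `@__tls_guard` as on `@s1` and `@t1`, which is what the added CHECK lines verify.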

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D81543/new/

https://reviews.llvm.org/D81543

Files:
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/CodeGen/ItaniumCXXABI.cpp
  clang/lib/CodeGen/MicrosoftCXXABI.cpp
  clang/test/CodeGen/tls-model.c
  clang/test/CodeGen/tls-model.cpp
  clang/test/CodeGenCXX/ms-thread_local.cpp


[PATCH] D82314: [Coroutines] Optimize the lifespan of temporary co_await object

2020-07-03 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

In D82314#2125713, @lxfind wrote:

> In D82314#2124662, @junparser wrote:
>
> > In D82314#2124661, @junparser wrote:
> >
> > > @lxfind This patch causes a mismatch when a variable is used in both the
> > > resume and destroy functions. Besides, we should move this patch and the
> > > check into buildCoroutineFrame.
> >
> >
> > @lxfind, would you try to fix this? If you do not have time, I'll try to
> > do it. Thanks.
>
>
> Could you please help take a look, if you have a local repro? Thanks!


Of course.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82314/new/

https://reviews.llvm.org/D82314




[PATCH] D82442: [Coroutines] Warning if the return type of coroutine_handle::address is not void*

2020-07-05 Thread JunMa via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rG8849831d55a2: [Coroutines] Warning if return type of 
coroutine_handle::address is not void* (authored by ChuanqiXu, committed by 
junparser).
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Changed prior to commit:
  https://reviews.llvm.org/D82442?vs=273926&id=275589#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82442/new/

https://reviews.llvm.org/D82442

Files:
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/lib/CodeGen/CodeGenFunction.h
  clang/lib/Sema/SemaCoroutine.cpp
  clang/test/SemaCXX/coroutine_handle-addres-return-type.cpp

Index: clang/test/SemaCXX/coroutine_handle-addres-return-type.cpp
===
--- /dev/null
+++ clang/test/SemaCXX/coroutine_handle-addres-return-type.cpp
@@ -0,0 +1,75 @@
+// RUN: %clang_cc1 -verify %s -stdlib=libc++ -std=c++1z -fcoroutines-ts -fsyntax-only
+
+namespace std::experimental {
+template 
+struct coroutine_handle;
+
+template <>
+struct coroutine_handle {
+  coroutine_handle() = default;
+  static coroutine_handle from_address(void *) noexcept;
+  void *address() const;
+};
+
+template 
+struct coroutine_handle : public coroutine_handle<> {
+};
+
+template 
+struct void_t_imp {
+  using type = void;
+};
+template 
+using void_t = typename void_t_imp::type;
+
+template 
+struct traits_sfinae_base {};
+
+template 
+struct traits_sfinae_base> {
+  using promise_type = typename T::promise_type;
+};
+
+template 
+struct coroutine_traits : public traits_sfinae_base {};
+} // namespace std::experimental
+
+struct suspend_never {
+  bool await_ready() noexcept;
+  void await_suspend(std::experimental::coroutine_handle<>) noexcept;
+  void await_resume() noexcept;
+};
+
+struct task {
+  struct promise_type {
+auto initial_suspend() { return suspend_never{}; }
+auto final_suspend() noexcept { return suspend_never{}; }
+auto get_return_object() { return task{}; }
+static void unhandled_exception() {}
+void return_void() {}
+  };
+};
+
+namespace std::experimental {
+template <>
+struct coroutine_handle : public coroutine_handle<> {
+  coroutine_handle *address() const; // expected-warning {{return type of 'coroutine_handle<>::address should be 'void*'}}
+};
+} // namespace std::experimental
+
+struct awaitable {
+  bool await_ready();
+
+  std::experimental::coroutine_handle
+  await_suspend(std::experimental::coroutine_handle<> handle);
+  void await_resume();
+} a;
+
+task f() {
+  co_await a;
+}
+
+int main() {
+  f();
+  return 0;
+}
Index: clang/lib/Sema/SemaCoroutine.cpp
===
--- clang/lib/Sema/SemaCoroutine.cpp
+++ clang/lib/Sema/SemaCoroutine.cpp
@@ -391,7 +391,13 @@
 return nullptr;
 
   Expr *JustAddress = AddressExpr.get();
-  // FIXME: Check that the type of AddressExpr is void*
+
+  // Check that the type of AddressExpr is void*
+  if (!JustAddress->getType().getTypePtr()->isVoidPointerType())
+S.Diag(cast(JustAddress)->getCalleeDecl()->getLocation(),
+   diag::warn_coroutine_handle_address_invalid_return_type)
+<< JustAddress->getType();
+
   return buildBuiltinCall(S, Loc, Builtin::BI__builtin_coro_resume,
   JustAddress);
 }
Index: clang/lib/CodeGen/CodeGenFunction.h
===
--- clang/lib/CodeGen/CodeGenFunction.h
+++ clang/lib/CodeGen/CodeGenFunction.h
@@ -1751,6 +1751,7 @@
   ~InlinedRegionBodyRAII() { CGF.AllocaInsertPt = OldAllocaIP; }
 };
   };
+
 private:
   /// CXXThisDecl - When generating code for a C++ member function,
   /// this will hold the implicit 'this' declaration.
Index: clang/include/clang/Basic/DiagnosticSemaKinds.td
===
--- clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -10527,6 +10527,9 @@
 def note_await_ready_no_bool_conversion : Note<
   "return type of 'await_ready' is required to be contextually convertible to 'bool'"
 >;
+def warn_coroutine_handle_address_invalid_return_type : Warning <
+  "return type of 'coroutine_handle<>::address should be 'void*' (have %0) in order to get capability with existing async C API.">,
+  InGroup;
 def err_coroutine_promise_final_suspend_requires_nothrow : Error<
   "the expression 'co_await __promise.final_suspend()' is required to be non-throwing"
 >;
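
The check added in SemaCoroutine.cpp can be approximated at the source level with a type trait. This is only a sketch: `good_handle`, `bad_handle` and `address_returns_void_ptr` are hypothetical names, and the trait tests for exactly `void *`, which may be stricter than what `isVoidPointerType()` accepts.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical handle-like types for illustration only.
struct good_handle {
  void *address() const { return nullptr; }
};
struct bad_handle {
  bad_handle *address() const { return nullptr; }
};

// True when calling address() on a const T yields exactly `void *`.
template <typename T>
constexpr bool address_returns_void_ptr =
    std::is_same<decltype(std::declval<const T &>().address()), void *>::value;

static_assert(address_returns_void_ptr<good_handle>,
              "address() returning void* passes the check");
static_assert(!address_returns_void_ptr<bad_handle>,
              "address() returning another pointer type is flagged");
```

A type like `bad_handle` corresponds to the specialization in the new test file whose `address()` returns `coroutine_handle *` and therefore triggers the warning.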

[PATCH] D82442: [Coroutines] Warning if the return type of coroutine_handle::address is not void*

2020-07-07 Thread JunMa via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rG8849831d55a2: [Coroutines] Warning if return type of 
coroutine_handle::address is not void* (authored by ChuanqiXu, committed by 
junparser).
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Changed prior to commit:
  https://reviews.llvm.org/D82442?vs=273926&id=275587#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82442/new/

https://reviews.llvm.org/D82442

Files:
  clang/include/clang/Basic/DiagnosticSemaKinds.td
  clang/lib/CodeGen/CodeGenFunction.h
  clang/lib/Sema/SemaCoroutine.cpp
  clang/test/SemaCXX/coroutine_handle-addres-return-type.cpp


[PATCH] D82029: [Coroutines] Ensure co_await promise.final_suspend() does not throw

2020-06-22 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

In D82029#2100675, @modocache wrote:

> Excellent, thank you! The test failures on the diff appear to be legitimate, 
> they reproduce for me when I apply this patch to my local checkout and run 
> `ninja check-clang`. Could you take a look?
>
> This LGTM otherwise! Also adding @junparser in case they have any thoughts, 
> if not I'll accept after the test failures are addressed, Thanks!


@modocache LGTM, @lxfind Thank you!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82029/new/

https://reviews.llvm.org/D82029




[PATCH] D82314: [RFC][Coroutines] Optimize the lifespan of temporary co_await object

2020-06-22 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

Rather than doing it here, can we build the await_resume call expression with
MaterializedTemporaryExpr when expanding the co_await expression? That's how
GCC does it.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82314/new/

https://reviews.llvm.org/D82314




[PATCH] D82314: [RFC][Coroutines] Optimize the lifespan of temporary co_await object

2020-06-23 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

In D82314#2109893, @rsmith wrote:

> In D82314#2109728, @lxfind wrote:
>
> > @rsmith Thanks. That's a good point. Do you know if there already exists 
> > optimization passes in LLVM that attempts to shrink the range of lifetime 
> > intrinsics? If so, I am curious why that does not help in this case. Or is 
> > it generally unsafe to move the lifetime intrinsics, and we could only do 
> > it here with specific context knowledge about coroutines.
>
>
> I don't know for sure, but I would expect someone to have implemented such a 
> pass already. Moving a lifetime start intrinsic later, past instructions that 
> can't possibly reference the object in question, seems like it should always 
> be safe and (presumably) should always be a good thing to do, and similarly 
> for moving lifetime end markers earlier. It could be that such a pass exists 
> but it is run too late in the pass pipeline, so the coroutine split pass 
> doesn't get to take advantage of it.


@lxfind, also, the lifetime markers of a variable are much more complex because
of exceptional paths (multiple lifetime starts and multiple lifetime ends), so
it is hard to optimize such cases.
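
A rough source-level picture of the point above, as a sketch only: the local below has a single scope but two ways out, the normal exit and the unwind path, and its destructor runs on both. In the generated IR each exit carries its own end-of-lifetime marker, which is what makes moving the markers harder. `Tracer` and `run` are hypothetical names; the example shows C++ behavior, not the IR itself.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Records construction and destruction so both exit paths are visible.
struct Tracer {
  std::vector<std::string> &log;
  explicit Tracer(std::vector<std::string> &l) : log(l) {
    log.push_back("start");
  }
  ~Tracer() { log.push_back("end"); }
};

// `t` leaves scope either normally or by unwinding when `fail` is true.
std::vector<std::string> run(bool fail) {
  std::vector<std::string> log;
  try {
    Tracer t(log);
    if (fail)
      throw std::runtime_error("boom");
    log.push_back("body");
  } catch (const std::runtime_error &) {
    // By the time the handler runs, `t` has already been destroyed.
    log.push_back("caught");
  }
  return log;
}
```

On the normal path the log reads start, body, end. On the exceptional path it reads start, end, caught, so the end of the object's lifetime is reached from a different place in the control flow.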


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82314/new/

https://reviews.llvm.org/D82314




[PATCH] D82314: [RFC][Coroutines] Optimize the lifespan of temporary co_await object

2020-06-23 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

In D82314#2109437, @lxfind wrote:

> In D82314#2107910, @junparser wrote:
>
> > Rather than doing it here, can we build the await_resume call expression
> > with MaterializedTemporaryExpr when expanding the co_await expression?
> > That's how GCC does it.
>
>
> There doesn't appear to be a way to do that in Clang. It goes from the AST to 
> IR directly, and there needs to be a MaterializedTemporaryExpr to wrap the 
> result of co_await. Could you elaborate on how this might be done in Clang?

For now, we only wrap the co_await expression with MaterializedTemporaryExpr
when the value kind of the result is VK_RValue. In that case we could wrap the
await_resume call instead when building the co_await expression, so that
emitSuspendExpression can directly emit the await_call expression with
MaterializedTemporaryExpr.

I think this should work, although I'm not so sure.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82314/new/

https://reviews.llvm.org/D82314


[PATCH] D82314: [RFC][Coroutines] Optimize the lifespan of temporary co_await object

2020-06-24 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

@lxfind, thank you! Could you please add some test cases?




Comment at: llvm/lib/Transforms/Coroutines/CoroSplit.cpp:1286
+continue;
+  if (CastInst) {
+// If we have multiple cast instructions for the alloca, don't

It is possible to handle multiple cast instructions as long as they are only 
used by lifetime marker intrinsics.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82314/new/

https://reviews.llvm.org/D82314



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D82314: [Coroutines] Optimize the lifespan of temporary co_await object

2020-06-29 Thread JunMa via Phabricator via cfe-commits

junparser accepted this revision.
junparser added a comment.
This revision is now accepted and ready to land.

LGTM, thank you!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82314/new/

https://reviews.llvm.org/D82314



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D82314: [Coroutines] Optimize the lifespan of temporary co_await object

2020-07-01 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

In D82314#2124661, @junparser wrote:

> @lxfind This patch causes a mismatch when a variable is used in both the
> resume and destroy functions. Besides, we should move this patch and the
> check into buildCoroutineFrame.


@lxfind, would you try to fix this? If you do not have time, I'll try to do
it. Thanks.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82314/new/

https://reviews.llvm.org/D82314



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D82314: [Coroutines] Optimize the lifespan of temporary co_await object

2020-07-01 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

@lxfind This patch causes a mismatch when a variable is used in both the resume
and destroy functions. Besides, we should move this patch and the check into
buildCoroutineFrame.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82314/new/

https://reviews.llvm.org/D82314



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D106333: [AArch64][SVE] Handle svbool_t VLST <-> VLAT/GNUT conversion

2021-07-19 Thread JunMa via Phabricator via cfe-commits

junparser created this revision.
junparser added reviewers: efriedma, bsmith, joechrisellis, c-rhodes, 
paulwalker-arm.
Herald added subscribers: psnobl, kristof.beyls, tschuett.
junparser requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

According to https://godbolt.org/z/q5rME1naY and the ACLE, Clang and GCC
handle SVE conversions differently. It turns out that LLVM does not handle
the width of SVE predicates properly.

This patch:

1. checks the SVE predicate width correctly for the svbool_t type;
2. removes the warning on svbool_t VLST <-> VLAT/GNUT conversion;
3. disables VLST <-> VLAT/GNUT conversion between SVE vectors and predicates,
   because their widths differ.
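
The width difference can be worked out with a little arithmetic. An SVE predicate holds one bit per byte of the data vector, so for a vector of N bits the data occupies N / 8 bytes while the predicate occupies N / 64 bytes. That is why the tests in this patch use `vector_size(N / 8)` for data and `vector_size(N / 64)` for predicates. The helper names below are hypothetical and the sketch uses plain integers rather than real SVE types, so it builds on any target.

```cpp
#include <cassert>
#include <cstddef>

// Bytes in an SVE data vector of `bits` bits (matches vector_size(N / 8)).
constexpr std::size_t sve_data_bytes(std::size_t bits) { return bits / 8; }

// Bytes in an SVE predicate: one predicate bit per data byte gives bits / 8
// predicate bits, i.e. bits / 64 bytes (matches vector_size(N / 64)).
constexpr std::size_t sve_pred_bytes(std::size_t bits) { return bits / 8 / 8; }

static_assert(sve_data_bytes(512) == 64, "512-bit data vector is 64 bytes");
static_assert(sve_pred_bytes(512) == 8, "512-bit predicate is 8 bytes");
static_assert(sve_data_bytes(128) == 16, "128-bit data vector is 16 bytes");
static_assert(sve_pred_bytes(128) == 2, "128-bit predicate is 2 bytes");
```

With `-msve-vector-bits=512`, as in the RUN lines of the lax-conversion test, a data vector is therefore eight times wider than a predicate, which is why conversions between the two are rejected.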


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D106333

Files:
  clang/lib/AST/ASTContext.cpp
  clang/lib/Sema/SemaChecking.cpp
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
  clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
  clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
  clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp

Index: clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
===
--- clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
+++ clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
@@ -9,6 +9,10 @@
 typedef svint8_t fixed_int8_t __attribute__((arm_sve_vector_bits(N)));
 typedef int8_t gnu_int8_t __attribute__((vector_size(N / 8)));
 
+typedef __SVBool_t svbool_t;
+typedef svbool_t fixed_bool_t __attribute__((arm_sve_vector_bits(N)));
+typedef int8_t gnu_bool_t __attribute__((vector_size(N / 64)));
+
 template struct S { T var; };
 
 S s;
@@ -24,3 +28,11 @@
 // Test implicit casts between GNU and VLS vectors
 fixed_int8_t to_fixed_int8_t__from_gnu_int8_t(gnu_int8_t x) { return x; }
 gnu_int8_t from_fixed_int8_t__to_gnu_int8_t(fixed_int8_t x) { return x; }
+
+// Test implicit casts between VLA and VLS perdicates
+svbool_t to_svbool_t(fixed_bool_t x) { return x; }
+fixed_bool_t from_svbool_t(svbool_t x) { return x; }
+
+// Test implicit casts between GNU and VLA predicates
+svbool_t to_svbool_t__from_gnu_bool_t(gnu_bool_t x) { return x; }
+gnu_bool_t from_svbool_t__to_gnu_bool_t(svbool_t x) { return x; }
Index: clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
===
--- clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
+++ clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
@@ -2,22 +2,24 @@
 // RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=integer -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-integer %s
 // RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=all -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-all %s
 
-// lax-vector-all-no-diagnostics
-
 #include 
 
 #define N __ARM_FEATURE_SVE_BITS
 #define SVE_FIXED_ATTR __attribute__((arm_sve_vector_bits(N)))
 #define GNU_FIXED_ATTR __attribute__((vector_size(N / 8)))
+#define GNU_BOOL_FIXED_ATTR __attribute__((vector_size(N / 64)))
 
 typedef svfloat32_t sve_fixed_float32_t SVE_FIXED_ATTR;
 typedef svint32_t sve_fixed_int32_t SVE_FIXED_ATTR;
+typedef svbool_t sve_fixed_bool_t SVE_FIXED_ATTR;
 typedef float gnu_fixed_float32_t GNU_FIXED_ATTR;
 typedef int gnu_fixed_int32_t GNU_FIXED_ATTR;
+typedef int8_t gnu_fixed_bool_t GNU_BOOL_FIXED_ATTR;
 
 void sve_allowed_with_integer_lax_conversions() {
   sve_fixed_int32_t fi32;
   svint64_t si64;
+  svbool_t sb8;
 
   // The implicit cast here should fail if -flax-vector-conversions=none, but pass if
   // -flax-vector-conversions={integer,all}.
@@ -25,6 +27,15 @@
   // lax-vector-none-error@-1 {{assigning to 'sve_fixed_int32_t' (vector of 16 'int' values) from incompatible type}}
   si64 = fi32;
   // lax-vector-none-error@-1 {{assigning to 'svint64_t' (aka '__SVInt64_t') from incompatible type}}
+
+  fi32 = sb8;
+  // lax-vector-none-error@-1 {{assigning to 'sve_fixed_int32_t' (vector of 16 'int' values) from incompatible type}}
+  // lax-vector-integer-error@-2 {{assigning to 'sve_fixed_int32_t' (vector of 16 'int' values) from incompatible type}}
+  // lax-vector-all-error@-3 {{assigning to 'sve_fixed_int32_t' (vector of 16 'int' values) from incompatible type}}
+  sb8 = fi32;
+  // lax-vector-none-error@-1 {{assigning to 'svbool_t' (aka '__SVBool_t') from incompatible type}}
+  // lax-vector-integer-error@-2 {{assigning to 'svbool_t' (aka '__SVBool_t') from incompatible type}}
+  // lax-vector-all-error@-3 {{assigning to 'svbool_t' (aka '__SVBool_t') from incompatible type}}
 }
 
 void sve_allowed_with_all_lax_conversions() {
@@ -44,6 +55,7 @@
 void gnu_allowed_with_integer_lax_conversions() {
   gnu_fixed_int32_t fi32;
   svint64_t si64;
+  svbool_t sb8;
 
   // The implicit cast here

[PATCH] D106333: [AArch64][SVE] Handle svbool_t VLST <-> VLAT/GNUT conversion

2021-07-19 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

@efriedma with this patch, all conversions between VLST and VLAT should have the
same vector size (getElementType() * getElementCount()). The regression in
D105097 will be fixed by using bitcast + vector.insert/extract directly.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106333/new/

https://reviews.llvm.org/D106333

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D106333: [AArch64][SVE] Handle svbool_t VLST <-> VLAT/GNUT conversion

2021-07-20 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

In D106333#2889168, @junparser wrote:

> @efriedma with this patch,  all of conversion between VLST and VLAT should 
> have same vector size(getElementType() * getElementCount()). The regression 
> in D105097  will be fixed by using bitcast 
> + vector.insert/extract directly

OK, actually this is wrong because of the vscale representation in LLVM IR. However,
can we still use bitcast as long as we can handle <32*i1> <64*i1>... in the
backend? Any suggestions about this? @efriedma


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106333/new/

https://reviews.llvm.org/D106333


[PATCH] D106333: [AArch64][SVE] Handle svbool_t VLST <-> VLAT/GNUT conversion

2021-07-20 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

In D106333#2889859, @paulwalker-arm wrote:

> In D106333#2889168 , @junparser 
> wrote:
>
>> @efriedma with this patch,  all of conversion between VLST and VLAT should 
>> have same vector size(getElementType() * getElementCount()). The regression 
>> in D105097  will be fixed by using bitcast 
>> + vector.insert/extract directly
>
> I hope I've not got the wrong end of the stick here but the above is our 
> intention.  As in, Arm is looking at replacing the "via memory predicate 
> casting" with a method that uses vector_of_i8s vector insert/extract with the 
> necessary bitcasting.  Before doing this we just had to fix up a bunch of 
> failing INSERT_SUBVECTOR cases when it comes to the illegal types this idiom 
> introduces.

Good to know this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106333/new/

https://reviews.llvm.org/D106333


[PATCH] D106333: [AArch64][SVE] Handle svbool_t VLST <-> VLAT/GNUT conversion

2021-07-20 Thread JunMa via Phabricator via cfe-commits

junparser updated this revision to Diff 360352.
junparser added a comment.

Address comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106333/new/

https://reviews.llvm.org/D106333

Files:
  clang/lib/AST/ASTContext.cpp
  clang/lib/Sema/SemaChecking.cpp
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
  clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
  clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
  clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp

Index: clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
===
--- clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
+++ clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
@@ -9,6 +9,10 @@
 typedef svint8_t fixed_int8_t __attribute__((arm_sve_vector_bits(N)));
 typedef int8_t gnu_int8_t __attribute__((vector_size(N / 8)));
 
+typedef __SVBool_t svbool_t;
+typedef svbool_t fixed_bool_t __attribute__((arm_sve_vector_bits(N)));
+typedef int8_t gnu_bool_t __attribute__((vector_size(N / 64)));
+
 template struct S { T var; };
 
 S s;
@@ -24,3 +28,11 @@
 // Test implicit casts between GNU and VLS vectors
 fixed_int8_t to_fixed_int8_t__from_gnu_int8_t(gnu_int8_t x) { return x; }
 gnu_int8_t from_fixed_int8_t__to_gnu_int8_t(fixed_int8_t x) { return x; }
+
+// Test implicit casts between VLA and VLS perdicates
+svbool_t to_svbool_t(fixed_bool_t x) { return x; }
+fixed_bool_t from_svbool_t(svbool_t x) { return x; }
+
+// Test implicit casts between GNU and VLA predicates
+svbool_t to_svbool_t__from_gnu_bool_t(gnu_bool_t x) { return x; }
+gnu_bool_t from_svbool_t__to_gnu_bool_t(svbool_t x) { return x; }
Index: clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
===
--- clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
+++ clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
@@ -2,22 +2,24 @@
 // RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=integer -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-integer %s
 // RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=all -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-all %s
 
-// lax-vector-all-no-diagnostics
-
 #include 
 
 #define N __ARM_FEATURE_SVE_BITS
 #define SVE_FIXED_ATTR __attribute__((arm_sve_vector_bits(N)))
 #define GNU_FIXED_ATTR __attribute__((vector_size(N / 8)))
+#define GNU_BOOL_FIXED_ATTR __attribute__((vector_size(N / 64)))
 
 typedef svfloat32_t sve_fixed_float32_t SVE_FIXED_ATTR;
 typedef svint32_t sve_fixed_int32_t SVE_FIXED_ATTR;
+typedef svbool_t sve_fixed_bool_t SVE_FIXED_ATTR;
 typedef float gnu_fixed_float32_t GNU_FIXED_ATTR;
 typedef int gnu_fixed_int32_t GNU_FIXED_ATTR;
+typedef int8_t gnu_fixed_bool_t GNU_BOOL_FIXED_ATTR;
 
 void sve_allowed_with_integer_lax_conversions() {
   sve_fixed_int32_t fi32;
   svint64_t si64;
+  svbool_t sb8;
 
   // The implicit cast here should fail if -flax-vector-conversions=none, but pass if
   // -flax-vector-conversions={integer,all}.
@@ -25,6 +27,15 @@
   // lax-vector-none-error@-1 {{assigning to 'sve_fixed_int32_t' (vector of 16 'int' values) from incompatible type}}
   si64 = fi32;
   // lax-vector-none-error@-1 {{assigning to 'svint64_t' (aka '__SVInt64_t') from incompatible type}}
+
+  fi32 = sb8;
+  // lax-vector-none-error@-1 {{assigning to 'sve_fixed_int32_t' (vector of 16 'int' values) from incompatible type}}
+  // lax-vector-integer-error@-2 {{assigning to 'sve_fixed_int32_t' (vector of 16 'int' values) from incompatible type}}
+  // lax-vector-all-error@-3 {{assigning to 'sve_fixed_int32_t' (vector of 16 'int' values) from incompatible type}}
+  sb8 = fi32;
+  // lax-vector-none-error@-1 {{assigning to 'svbool_t' (aka '__SVBool_t') from incompatible type}}
+  // lax-vector-integer-error@-2 {{assigning to 'svbool_t' (aka '__SVBool_t') from incompatible type}}
+  // lax-vector-all-error@-3 {{assigning to 'svbool_t' (aka '__SVBool_t') from incompatible type}}
 }
 
 void sve_allowed_with_all_lax_conversions() {
@@ -44,6 +55,7 @@
 void gnu_allowed_with_integer_lax_conversions() {
   gnu_fixed_int32_t fi32;
   svint64_t si64;
+  svbool_t sb8;
 
   // The implicit cast here should fail if -flax-vector-conversions=none, but pass if
   // -flax-vector-conversions={integer,all}.
@@ -51,6 +63,15 @@
   // lax-vector-none-error@-1 {{assigning to 'gnu_fixed_int32_t' (vector of 16 'int' values) from incompatible type}}
   si64 = fi32;
   // lax-vector-none-error@-1 {{assigning to 'svint64_t' (aka '__SVInt64_t') from incompatible type}}
+
+  fi32 = sb8;
+  // lax-vector-none-error@-1 {{assigning to 'gnu_fixed_int32_t' (vector of 16 'int' values) from incompatible type}}
+  // lax-vector-integer-error@-2 {{assigning to 'gnu_fixed_int32_t' (vector of 1

[PATCH] D106333: [AArch64][SVE] Handle svbool_t VLST <-> VLAT/GNUT conversion

2021-07-21 Thread JunMa via Phabricator via cfe-commits

junparser added inline comments.



Comment at: clang/lib/AST/ASTContext.cpp:8677
+  return Ty->getKind() == BuiltinType::SveBool
+ ? Context.getLangOpts().ArmSveVectorBits / Context.getCharWidth()
+ : Context.getLangOpts().ArmSveVectorBits;

paulwalker-arm wrote:
> Out of interest is this indirection necessary? I mean we know sve predicates 
> are exactly an eighth the size of sve vectors so why not just use `8`?
I just wanted to keep the same style, since HandleArmSveVectorBitsTypeAttr uses
this. Using 8 makes sense to me.
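
The size relation under discussion can be sketched in a few lines. This is a minimal illustration, not the code in ASTContext.cpp: `fixedSveTypeBits`, `fixedSveTypeBytes` and `kCharWidth` are hypothetical names. The only facts taken from the thread are that an SVE predicate is one eighth the size of an SVE data vector and that the tests declare the GNU predicate type with `vector_size(N / 64)`.

```cpp
// Hypothetical stand-in for Context.getCharWidth(); assumed to be 8 here.
constexpr unsigned kCharWidth = 8;

// Size in bits of a fixed-length SVE type for a given -msve-vector-bits value.
// A predicate holds one bit per vector byte, so it is 1/8 the vector size.
constexpr unsigned fixedSveTypeBits(bool IsPredicate, unsigned SveVectorBits) {
  return IsPredicate ? SveVectorBits / kCharWidth : SveVectorBits;
}

// Size in bytes, which is the unit __attribute__((vector_size(...))) takes.
constexpr unsigned fixedSveTypeBytes(bool IsPredicate, unsigned SveVectorBits) {
  return fixedSveTypeBits(IsPredicate, SveVectorBits) / kCharWidth;
}

// With -msve-vector-bits=512 (N = 512), as in the RUN lines of the tests:
static_assert(fixedSveTypeBytes(false, 512) == 512 / 8,
              "data vector occupies N / 8 bytes");
static_assert(fixedSveTypeBytes(true, 512) == 512 / 64,
              "predicate occupies N / 64 bytes");
```

Whether the divisor is spelled `8` or obtained from the context is purely a style choice in this sketch; both give the same sizes.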


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106333/new/

https://reviews.llvm.org/D106333


[PATCH] D106333: [AArch64][SVE] Handle svbool_t VLST <-> VLAT/GNUT conversion

2021-07-21 Thread JunMa via Phabricator via cfe-commits

junparser updated this revision to Diff 360690.
junparser added a comment.

Address comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106333/new/

https://reviews.llvm.org/D106333

Files:
  clang/lib/AST/ASTContext.cpp
  clang/lib/Sema/SemaChecking.cpp
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
  clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
  clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
  clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp

Index: clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
===
--- clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
+++ clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
@@ -9,6 +9,10 @@
 typedef svint8_t fixed_int8_t __attribute__((arm_sve_vector_bits(N)));
 typedef int8_t gnu_int8_t __attribute__((vector_size(N / 8)));
 
+typedef __SVBool_t svbool_t;
+typedef svbool_t fixed_bool_t __attribute__((arm_sve_vector_bits(N)));
+typedef int8_t gnu_bool_t __attribute__((vector_size(N / 64)));
+
 template struct S { T var; };
 
 S s;
@@ -24,3 +28,11 @@
 // Test implicit casts between GNU and VLS vectors
 fixed_int8_t to_fixed_int8_t__from_gnu_int8_t(gnu_int8_t x) { return x; }
 gnu_int8_t from_fixed_int8_t__to_gnu_int8_t(fixed_int8_t x) { return x; }
+
+// Test implicit casts between VLA and VLS predicates
+svbool_t to_svbool_t(fixed_bool_t x) { return x; }
+fixed_bool_t from_svbool_t(svbool_t x) { return x; }
+
+// Test implicit casts between GNU and VLA predicates
+svbool_t to_svbool_t__from_gnu_bool_t(gnu_bool_t x) { return x; }
+gnu_bool_t from_svbool_t__to_gnu_bool_t(svbool_t x) { return x; }
Index: clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
===
--- clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
+++ clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
@@ -2,22 +2,25 @@
 // RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=integer -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-integer %s
 // RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=all -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-all %s
 
-// lax-vector-all-no-diagnostics
-
 #include 
 
 #define N __ARM_FEATURE_SVE_BITS
 #define SVE_FIXED_ATTR __attribute__((arm_sve_vector_bits(N)))
 #define GNU_FIXED_ATTR __attribute__((vector_size(N / 8)))
+#define GNU_BOOL_FIXED_ATTR __attribute__((vector_size(N / 64)))
 
 typedef svfloat32_t sve_fixed_float32_t SVE_FIXED_ATTR;
 typedef svint32_t sve_fixed_int32_t SVE_FIXED_ATTR;
+typedef svbool_t sve_fixed_bool_t SVE_FIXED_ATTR;
 typedef float gnu_fixed_float32_t GNU_FIXED_ATTR;
 typedef int gnu_fixed_int32_t GNU_FIXED_ATTR;
+typedef int8_t gnu_fixed_bool_t GNU_BOOL_FIXED_ATTR;
 
 void sve_allowed_with_integer_lax_conversions() {
   sve_fixed_int32_t fi32;
   svint64_t si64;
+  svbool_t sb8;
+  sve_fixed_bool_t fb8;
 
   // The implicit cast here should fail if -flax-vector-conversions=none, but pass if
   // -flax-vector-conversions={integer,all}.
@@ -25,6 +28,25 @@
   // lax-vector-none-error@-1 {{assigning to 'sve_fixed_int32_t' (vector of 16 'int' values) from incompatible type}}
   si64 = fi32;
   // lax-vector-none-error@-1 {{assigning to 'svint64_t' (aka '__SVInt64_t') from incompatible type}}
+
+  fi32 = sb8;
+  // lax-vector-none-error@-1 {{assigning to 'sve_fixed_int32_t' (vector of 16 'int' values) from incompatible type}}
+  // lax-vector-integer-error@-2 {{assigning to 'sve_fixed_int32_t' (vector of 16 'int' values) from incompatible type}}
+  // lax-vector-all-error@-3 {{assigning to 'sve_fixed_int32_t' (vector of 16 'int' values) from incompatible type}}
+  sb8 = fi32;
+  // lax-vector-none-error@-1 {{assigning to 'svbool_t' (aka '__SVBool_t') from incompatible type}}
+  // lax-vector-integer-error@-2 {{assigning to 'svbool_t' (aka '__SVBool_t') from incompatible type}}
+  // lax-vector-all-error@-3 {{assigning to 'svbool_t' (aka '__SVBool_t') from incompatible type}}
+
+  si64 = fb8;
+  // lax-vector-none-error@-1 {{assigning to 'svint64_t' (aka '__SVInt64_t') from incompatible type}}
+  // lax-vector-integer-error@-2 {{assigning to 'svint64_t' (aka '__SVInt64_t') from incompatible type}}
+  // lax-vector-all-error@-3 {{assigning to 'svint64_t' (aka '__SVInt64_t') from incompatible type}}
+
+  fb8 = si64;
+  // lax-vector-none-error@-1 {{assigning to 'sve_fixed_bool_t' (vector of 8 'unsigned char' values) from incompatible type}}
+  // lax-vector-integer-error@-2 {{assigning to 'sve_fixed_bool_t' (vector of 8 'unsigned char' values) from incompatible type}}
+  // lax-vector-all-error@-3 {{assigning to 'sve_fixed_bool_t' (vector of 8 'unsigned char' values) from incompatible type}}
 }
 
 void sve_allowed_with_all_lax_conv

[PATCH] D106333: [AArch64][SVE] Handle svbool_t VLST <-> VLAT/GNUT conversion

2021-07-21 Thread JunMa via Phabricator via cfe-commits

This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG599b2f00370e: [AArch64][SVE] Handle svbool_t VLST <-> 
VLAT/GNUT conversion (authored by junparser).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106333/new/

https://reviews.llvm.org/D106333

Files:
  clang/lib/AST/ASTContext.cpp
  clang/lib/Sema/SemaChecking.cpp
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
  clang/test/SemaCXX/aarch64-sve-explicit-casts-fixed-size.cpp
  clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
  clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp

Index: clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
===
--- clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
+++ clang/test/SemaCXX/attr-arm-sve-vector-bits.cpp
@@ -9,6 +9,10 @@
 typedef svint8_t fixed_int8_t __attribute__((arm_sve_vector_bits(N)));
 typedef int8_t gnu_int8_t __attribute__((vector_size(N / 8)));
 
+typedef __SVBool_t svbool_t;
+typedef svbool_t fixed_bool_t __attribute__((arm_sve_vector_bits(N)));
+typedef int8_t gnu_bool_t __attribute__((vector_size(N / 64)));
+
 template struct S { T var; };
 
 S s;
@@ -24,3 +28,11 @@
 // Test implicit casts between GNU and VLS vectors
 fixed_int8_t to_fixed_int8_t__from_gnu_int8_t(gnu_int8_t x) { return x; }
 gnu_int8_t from_fixed_int8_t__to_gnu_int8_t(fixed_int8_t x) { return x; }
+
+// Test implicit casts between VLA and VLS predicates
+svbool_t to_svbool_t(fixed_bool_t x) { return x; }
+fixed_bool_t from_svbool_t(svbool_t x) { return x; }
+
+// Test implicit casts between GNU and VLA predicates
+svbool_t to_svbool_t__from_gnu_bool_t(gnu_bool_t x) { return x; }
+gnu_bool_t from_svbool_t__to_gnu_bool_t(svbool_t x) { return x; }
Index: clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
===
--- clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
+++ clang/test/SemaCXX/aarch64-sve-lax-vector-conversions.cpp
@@ -2,22 +2,25 @@
 // RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=integer -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-integer %s
 // RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -msve-vector-bits=512 -flax-vector-conversions=all -fallow-half-arguments-and-returns -ffreestanding -fsyntax-only -verify=lax-vector-all %s
 
-// lax-vector-all-no-diagnostics
-
 #include 
 
 #define N __ARM_FEATURE_SVE_BITS
 #define SVE_FIXED_ATTR __attribute__((arm_sve_vector_bits(N)))
 #define GNU_FIXED_ATTR __attribute__((vector_size(N / 8)))
+#define GNU_BOOL_FIXED_ATTR __attribute__((vector_size(N / 64)))
 
 typedef svfloat32_t sve_fixed_float32_t SVE_FIXED_ATTR;
 typedef svint32_t sve_fixed_int32_t SVE_FIXED_ATTR;
+typedef svbool_t sve_fixed_bool_t SVE_FIXED_ATTR;
 typedef float gnu_fixed_float32_t GNU_FIXED_ATTR;
 typedef int gnu_fixed_int32_t GNU_FIXED_ATTR;
+typedef int8_t gnu_fixed_bool_t GNU_BOOL_FIXED_ATTR;
 
 void sve_allowed_with_integer_lax_conversions() {
   sve_fixed_int32_t fi32;
   svint64_t si64;
+  svbool_t sb8;
+  sve_fixed_bool_t fb8;
 
   // The implicit cast here should fail if -flax-vector-conversions=none, but pass if
   // -flax-vector-conversions={integer,all}.
@@ -25,6 +28,25 @@
   // lax-vector-none-error@-1 {{assigning to 'sve_fixed_int32_t' (vector of 16 'int' values) from incompatible type}}
   si64 = fi32;
   // lax-vector-none-error@-1 {{assigning to 'svint64_t' (aka '__SVInt64_t') from incompatible type}}
+
+  fi32 = sb8;
+  // lax-vector-none-error@-1 {{assigning to 'sve_fixed_int32_t' (vector of 16 'int' values) from incompatible type}}
+  // lax-vector-integer-error@-2 {{assigning to 'sve_fixed_int32_t' (vector of 16 'int' values) from incompatible type}}
+  // lax-vector-all-error@-3 {{assigning to 'sve_fixed_int32_t' (vector of 16 'int' values) from incompatible type}}
+  sb8 = fi32;
+  // lax-vector-none-error@-1 {{assigning to 'svbool_t' (aka '__SVBool_t') from incompatible type}}
+  // lax-vector-integer-error@-2 {{assigning to 'svbool_t' (aka '__SVBool_t') from incompatible type}}
+  // lax-vector-all-error@-3 {{assigning to 'svbool_t' (aka '__SVBool_t') from incompatible type}}
+
+  si64 = fb8;
+  // lax-vector-none-error@-1 {{assigning to 'svint64_t' (aka '__SVInt64_t') from incompatible type}}
+  // lax-vector-integer-error@-2 {{assigning to 'svint64_t' (aka '__SVInt64_t') from incompatible type}}
+  // lax-vector-all-error@-3 {{assigning to 'svint64_t' (aka '__SVInt64_t') from incompatible type}}
+
+  fb8 = si64;
+  // lax-vector-none-error@-1 {{assigning to 'sve_fixed_bool_t' (vector of 8 'unsigned char' values) from incompatible type}}
+  // lax-vector-integer-error@-2 {{assigning to 'sve_fixed_bool_t' (vector of 8 'unsigned char' values) from incompatible type}}
+  // lax-v

[PATCH] D106860: [clang][AArch64][SVE] Avoid going through memory for fixed/scalable predicate casts

2021-07-28 Thread JunMa via Phabricator via cfe-commits

junparser added inline comments.
Herald added a subscriber: ctetreau.



Comment at: clang/lib/CodeGen/CGExprScalar.cpp:2102
+  Src = Builder.CreateBitCast(Src, SrcTy);
+}
 if (ScalableSrc->getElementType() == FixedDst->getElementType()) {

I think this may also work for casts between vectors with different element
types.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106860/new/

https://reviews.llvm.org/D106860


[PATCH] D106860: [clang][AArch64][SVE] Avoid going through memory for fixed/scalable predicate casts

2021-07-29 Thread JunMa via Phabricator via cfe-commits

junparser added inline comments.



Comment at: clang/lib/CodeGen/CGExprScalar.cpp:2102
+  Src = Builder.CreateBitCast(Src, SrcTy);
+}
 if (ScalableSrc->getElementType() == FixedDst->getElementType()) {

bsmith wrote:
> junparser wrote:
> > I think this may also work for casts between vectors with different
> > element types.
> A similar argument applies here as in the other related ticket: in principle
> we could, but it's not clear that there is a good use case for writing code
> that would make use of this. So for now it's probably best to just deal with
> predicates, which are definitely a problem, and handle other cases as they
> arise.
I still believe this generates better code than going through a memory
load/store, but thanks for explaining this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106860/new/

https://reviews.llvm.org/D106860


[PATCH] D113336: [RISCV] Imply extensions in RISCVTargetInfo::initFeatureMap

2022-02-11 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.
Herald added a subscriber: pcwang-thead.

@eopXD, hi, this patch makes us lose +relax and -save-restore by default.
Would you please fix it?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113336/new/

https://reviews.llvm.org/D113336


[PATCH] D119541: [RISCV] Fix RISCVTargetInfo::initFeatureMap, add non-ISA features back after implication

2022-02-13 Thread JunMa via Phabricator via cfe-commits

junparser accepted this revision.
junparser added a comment.
This revision is now accepted and ready to land.

LGTM, thanks for the fix.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119541/new/

https://reviews.llvm.org/D119541


[PATCH] D117087: [C++20] [Coroutines] Implement return value optimization for get_return_object

2022-02-15 Thread JunMa via Phabricator via cfe-commits

junparser accepted this revision.
junparser added a comment.
This revision is now accepted and ready to land.

LGTM. Thanks!




Comment at: clang/lib/CodeGen/CGCoroutine.cpp:650
 
-  if (Stmt *Ret = S.getReturnStmt())
+  if (Stmt *Ret = S.getReturnStmt()) {
+// Since we already emitted the return value above, so we shouldn't

can we just remove this?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117087/new/

https://reviews.llvm.org/D117087


[PATCH] D117087: [C++20] [Coroutines] Implement return value optimization for get_return_object

2022-02-15 Thread JunMa via Phabricator via cfe-commits

junparser added inline comments.



Comment at: clang/lib/CodeGen/CGCoroutine.cpp:654
+cast(Ret)->setRetValue(nullptr);
 EmitStmt(Ret);
+  }

I mean, remove the if statement here, since the return expr is null.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117087/new/

https://reviews.llvm.org/D117087


[PATCH] D117087: [C++20] [Coroutines] Implement return value optimization for get_return_object

2022-02-15 Thread JunMa via Phabricator via cfe-commits

junparser added inline comments.



Comment at: clang/lib/CodeGen/CGCoroutine.cpp:654
+cast(Ret)->setRetValue(nullptr);
 EmitStmt(Ret);
+  }

ChuanqiXu wrote:
> junparser wrote:
> > I mean, remove the if statement here, since the return expr is null.
> We couldn't, since we still need to emit the ret instruction if needed.
Makes sense to me.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117087/new/

https://reviews.llvm.org/D117087


[PATCH] D71903: [Coroutines][6/6] Clang schedules new passes

2019-12-29 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

There is another issue we should consider: clang crashes when compiling a
coroutine with -disable-llvm-passes and emitting an object file.
Is it reasonable to run the coroutine passes even when -disable-llvm-passes is
enabled?




Comment at: clang/lib/CodeGen/BackendUtil.cpp:1227
+  MPM.addPass(createModuleToPostOrderCGSCCPassAdaptor(CoroSplitPass()));
+  MPM.addPass(createModuleToFunctionPassAdaptor(CoroElidePass()));
+  MPM.addPass(createModuleToPostOrderCGSCCPassAdaptor(CoroSplitPass()));

modocache wrote:
> junparser wrote:
> > Since coro elision depends on other optimization pass(inline and so on)  
> > implicitly,  how can we adjust the pipeline to achieve this.
> One option would be to use the new pass manager's registration callbacks, 
> like `PassBuilder::registerPipelineStartEPCallback` or 
> `PassBuilder::registerOptimizerLastEPCallback`. These work similarly to the 
> old pass manager's `PassManagerBuilder::addExtension`. That's something that 
> I think would be good to improve in a follow-up patch, but let me know if 
> you'd rather see it in this one.
Yes, please. It should be done in this patch set.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71903/new/

https://reviews.llvm.org/D71903




[PATCH] D71903: [Coroutines][6/6] Clang schedules new passes

2020-01-04 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

In D71903#1804016, @modocache wrote:

> I'm currently working on ensuring that CGSCC optimizations are rerun to 
> optimize coroutine funclets -- the primary feedback I received on this and on 
> D71899  -- but I just realized I didn't 
> respond to one comment on this set of reviews, from @junparser:
>
> > There is another issue we should consider: clang is crashed when compile 
> > coroutine with -disable-llvm-passes and output an object file.
>
> It's always been the case, since the coroutine intrinsics and passes were 
> first added to LLVM, that attempting to codegen without first running 
> coroutine passes would cause a crash during instruction selection. So `clang 
> -Xclang -disable-llvm-passes -c` has always crashed Clang during LLVM ISel, 
> as it does in this example that uses Clang 9.0.0 and the legacy pass manager: 
> https://godbolt.org/z/Mj2R5G
>
> Personally I'm of the opinion that this is less than ideal... I may be wrong, 
> but I don't think there are very many other C++ features that *require* Clang 
> to run LLVM passes (perhaps the `always_inline` attribute requires LLVM 
> passes to be run for correctness? I'm not sure). So I would like to see this 
> eventually addressed somehow.
>
> > Is it reasonable to run coroutine passes even -disable-llvm-passes is 
> > enabled?
>
> My personal opinion is that this would not be reasonable. The option 
> `-disable-llvm-passes` should, from my point of view, prevent any and all 
> LLVM passes from being run. I also frequently make use of this option when 
> debugging the LLVM IR being output for C++ coroutines code, so if 
> `-disable-llvm-passes` didn't disable coroutines passes, I'd need another 
> option that did 
> (`-disable-llvm-passes-no-really-even-coroutine-passes-them-too` 😅).
>
> All this being said, considering this behavior has existed in the legacy PM 
> since day one, I think we should start a separate discussion on if/how to 
> change that behavior. I'm working on an update for these patches to address 
> funclet optimization, but the update will not change the fact that coroutine 
> passes are not run when `-disable-llvm-passes` is specified. I think that's 
> an orthogonal issue.


Makes sense to me, thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71903/new/

https://reviews.llvm.org/D71903




[PATCH] D87470: [Coroutine][Sema] Tighten the lifetime of symmetric transfer returned handle

2020-09-10 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

Thanks for the change. LGTM. Could you add a test case?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87470/new/

https://reviews.llvm.org/D87470


[PATCH] D87470: [Coroutine][Sema] Tighten the lifetime of symmetric transfer returned handle

2020-09-15 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.
Herald added a subscriber: modimo.

@lxfind, could you backport this to branch 11?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D87470/new/

https://reviews.llvm.org/D87470


[PATCH] D89066: [Coroutine][Sema] Only tighten the suspend call temp lifetime for final awaiter

2020-10-11 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

Why shouldn't we do this for normal await calls?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89066/new/

https://reviews.llvm.org/D89066


[PATCH] D89066: [Coroutine][Sema] Only tighten the suspend call temp lifetime for final awaiter

2020-10-12 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

In D89066#2324151, @lxfind wrote:

> In D89066#2324115 , @junparser wrote:
>
>> why we should not do this with normal await call?
>
> To be honest, I don't know yet. My understanding of how expression cleanup
> and temp lifetime management work is insufficient at the moment.
> But first of all, without adding any cleanup expression here, I saw ASAN
> failures due to heap-use-after-free, because sometimes the frame has already
> been destroyed after the await_suspend call, and yet we are still writing
> into the frame due to an unnecessarily cross-suspend lifetime. However, if I
> apply the cleanup to all await_suspend calls, it also causes ASAN failures as
> it's cleaning up data that's still alive.
> So this patch is more of a temporary workaround to stop the bleeding without
> causing any trouble.
> I plan to get back to this later, after I am done with the spilling/alloca
> issues.

I'm not familiar with ASAN instrumentation. Do you have any testcases to 
explain this?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89066/new/

https://reviews.llvm.org/D89066


[PATCH] D89066: [Coroutine][Sema] Only tighten the suspend call temp lifetime for final awaiter

2020-10-12 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

In D89066#2325358, @lxfind wrote:

> In D89066#2324291 , @junparser wrote:
>
>> In D89066#2324151 , @lxfind wrote:
>>
>>> In D89066#2324115 , @junparser 
>>> wrote:
>>>
>>>> why we should not do this with normal await call?
>>>
>>> To be honest, I don't know yet. My understanding of how expression cleanup
>>> and temp lifetime management work is insufficient at the moment.
>>> But first of all, without adding any cleanup expression here, I saw ASAN
>>> failures due to heap-use-after-free, because sometimes the frame has
>>> already been destroyed after the await_suspend call, and yet we are still
>>> writing into the frame due to an unnecessarily cross-suspend lifetime.
>>> However, if I apply the cleanup to all await_suspend calls, it also causes
>>> ASAN failures as it's cleaning up data that's still alive.
>>> So this patch is more of a temporary workaround to stop the bleeding without
>>> causing any trouble.
>>> I plan to get back to this later, after I am done with the spilling/alloca
>>> issues.
>>
>> I'm not familiar with ASAN instrumentation. Do you have any testcases to 
>> explain this?
>
> Unfortunately I don't.  But this is not related to ASAN. Basically, this is 
> causing destructing of objects that should still be alive. I suspect that 
> it's because ExprWithCleanups always clean up temps that belongs to the full 
> expression, not just the sub-expression in it.

Should we also add ExprWithCleanups when expanding co_await into the
await_ready, await_suspend and await_resume expressions? That might fix this
issue.
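
For readers following the lifetime argument: the point about temporaries
living until the end of the full expression can be seen in plain C++, without
any coroutine machinery. The sketch below is only an illustration of that
language rule; Temp, use and fullExpression are hypothetical names, and nothing
here is taken from the patch.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Log of what happened, in order.
static std::vector<std::string> events;

// A temporary that records its own destruction.
struct Temp {
  std::string name;
  explicit Temp(std::string n) : name(std::move(n)) {}
  ~Temp() { events.push_back("destroy " + name); }
};

// A "sub-expression" that consumes a temporary.
int use(const Temp &t) {
  events.push_back("use " + t.name);
  return 1;
}

int fullExpression() {
  events.clear();
  // Both temporaries outlive the call that uses them: they are destroyed
  // only when the whole statement (the full expression) has been evaluated.
  int r = use(Temp("a")) + use(Temp("b"));
  events.push_back("end of statement");
  return r;
}
```

Both "use" entries are logged before either "destroy" entry, which is the same
shape as a cleanup attached to the full expression rather than to one
sub-expression of it.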


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89066/new/

https://reviews.llvm.org/D89066


[PATCH] D108138: [SimplifyCFG] Remove switch statements before vectorization

2021-08-17 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

Since we already have LowerSwitchPass to transform SwitchInst, could we add a
cost model to it and run it before vectorization?
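
To make the suggestion concrete, here is a source-level sketch of what
removing a switch from a loop body means. It is only an analogy: the real
transformation works on LLVM IR, and the function names below are
hypothetical. The first loop uses a multi-way switch; the second computes the
same mapping with a chain of two-way compares, which is roughly the shape a
switch has after being lowered to compare-and-branch form.

```cpp
#include <cassert>

// Loop body with a switch (a multi-way branch).
void mapWithSwitch(const int *kind, int *out, int n) {
  for (int i = 0; i < n; ++i) {
    switch (kind[i]) {
    case 0:  out[i] = 1; break;
    case 1:  out[i] = 4; break;
    case 2:  out[i] = 9; break;
    default: out[i] = 0; break;
    }
  }
}

// Same mapping written as a chain of two-way compares.
void mapWithCompares(const int *kind, int *out, int n) {
  for (int i = 0; i < n; ++i) {
    int k = kind[i];
    if (k == 0)
      out[i] = 1;
    else if (k == 1)
      out[i] = 4;
    else if (k == 2)
      out[i] = 9;
    else
      out[i] = 0;
  }
}

// Checks that both forms agree on a small input.
bool sameResults() {
  const int kind[6] = {0, 1, 2, 3, -1, 1};
  int a[6], b[6];
  mapWithSwitch(kind, a, 6);
  mapWithCompares(kind, b, 6);
  for (int i = 0; i < 6; ++i)
    if (a[i] != b[i])
      return false;
  return true;
}
```

A cost model would have to decide when the compare chain is worth it, for
example by bounding the number of cases; that policy is not sketched here.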


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108138/new/

https://reviews.llvm.org/D108138


[PATCH] D104852: [AArch64][SVEIntrinsicOpts] Convert cntb/h/w/d to vscale intrinsic or constant.

2021-06-24 Thread JunMa via Phabricator via cfe-commits

junparser updated this revision to Diff 354427.
junparser added a comment.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

update clang test.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104852/new/

https://reviews.llvm.org/D104852

Files:
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cntb.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cntd.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cnth.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cntw.c
  llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
  llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-counting-elems.ll

Index: llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-counting-elems.ll
===
--- /dev/null
+++ llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-counting-elems.ll
@@ -0,0 +1,247 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -S -instcombine < %s | FileCheck %s
+
+target triple = "aarch64-unknown-linux-gnu"
+
+;
+; CNTB
+;
+
+define i64 @cntb_vl1() {
+; CHECK-LABEL: @cntb_vl1(
+; CHECK-NEXT:ret i64 1
+;
+  %out = call i64 @llvm.aarch64.sve.cntb(i32 1)
+  ret i64 %out
+}
+
+define i64 @cntb_vl2() {
+; CHECK-LABEL: @cntb_vl2(
+; CHECK-NEXT:ret i64 2
+;
+  %out = call i64 @llvm.aarch64.sve.cntb(i32 2)
+  ret i64 %out
+}
+
+define i64 @cntb_vl4() {
+; CHECK-LABEL: @cntb_vl4(
+; CHECK-NEXT:ret i64 4
+;
+  %out = call i64 @llvm.aarch64.sve.cntb(i32 4)
+  ret i64 %out
+}
+
+define i64 @cntb_mul3() {
+; CHECK-LABEL: @cntb_mul3(
+; CHECK-NEXT:ret i64 24
+;
+  %cnt = call i64 @llvm.aarch64.sve.cntb(i32 8)
+  %out = mul i64 %cnt, 3
+  ret i64 %out
+}
+
+define i64 @cntb_mul4() {
+; CHECK-LABEL: @cntb_mul4(
+; CHECK-NEXT:ret i64 64
+;
+  %cnt = call i64 @llvm.aarch64.sve.cntb(i32 9)
+  %out = mul i64 %cnt, 4
+  ret i64 %out
+}
+
+define i64 @cntb_all() {
+; CHECK-LABEL: @cntb_all(
+; CHECK-NEXT:[[TMP1:%.*]] = call i64 @llvm.vscale.i64()
+; CHECK-NEXT:[[OUT:%.*]] = shl i64 [[TMP1]], 4
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %out = call i64 @llvm.aarch64.sve.cntb(i32 31)
+  ret i64 %out
+}
+
+;
+; CNTH
+;
+
+define i64 @cnth_vl1() {
+; CHECK-LABEL: @cnth_vl1(
+; CHECK-NEXT:ret i64 1
+;
+  %out = call i64 @llvm.aarch64.sve.cnth(i32 1)
+  ret i64 %out
+}
+
+define i64 @cnth_vl2() {
+; CHECK-LABEL: @cnth_vl2(
+; CHECK-NEXT:ret i64 2
+;
+  %out = call i64 @llvm.aarch64.sve.cnth(i32 2)
+  ret i64 %out
+}
+
+define i64 @cnth_vl4() {
+; CHECK-LABEL: @cnth_vl4(
+; CHECK-NEXT:ret i64 4
+;
+  %out = call i64 @llvm.aarch64.sve.cnth(i32 4)
+  ret i64 %out
+}
+
+define i64 @cnth_mul3() {
+; CHECK-LABEL: @cnth_mul3(
+; CHECK-NEXT:ret i64 24
+;
+  %cnt = call i64 @llvm.aarch64.sve.cnth(i32 8)
+  %out = mul i64 %cnt, 3
+  ret i64 %out
+}
+
+define i64 @cnth_mul4() {
+; CHECK-LABEL: @cnth_mul4(
+; CHECK-NEXT:[[CNT:%.*]] = call i64 @llvm.aarch64.sve.cnth(i32 9)
+; CHECK-NEXT:[[OUT:%.*]] = shl i64 [[CNT]], 2
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %cnt = call i64 @llvm.aarch64.sve.cnth(i32 9)
+  %out = mul i64 %cnt, 4
+  ret i64 %out
+}
+
+define i64 @cnth_all() {
+; CHECK-LABEL: @cnth_all(
+; CHECK-NEXT:[[TMP1:%.*]] = call i64 @llvm.vscale.i64()
+; CHECK-NEXT:[[OUT:%.*]] = shl i64 [[TMP1]], 3
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %out = call i64 @llvm.aarch64.sve.cnth(i32 31)
+  ret i64 %out
+}
+
+;
+; CNTW
+;
+
+define i64 @cntw_vl1() {
+; CHECK-LABEL: @cntw_vl1(
+; CHECK-NEXT:ret i64 1
+;
+  %out = call i64 @llvm.aarch64.sve.cntw(i32 1)
+  ret i64 %out
+}
+
+define i64 @cntw_vl2() {
+; CHECK-LABEL: @cntw_vl2(
+; CHECK-NEXT:ret i64 2
+;
+  %out = call i64 @llvm.aarch64.sve.cntw(i32 2)
+  ret i64 %out
+}
+
+define i64 @cntw_vl4() {
+; CHECK-LABEL: @cntw_vl4(
+; CHECK-NEXT:ret i64 4
+;
+  %out = call i64 @llvm.aarch64.sve.cntw(i32 4)
+  ret i64 %out
+}
+
+define i64 @cntw_mul3() {
+; CHECK-LABEL: @cntw_mul3(
+; CHECK-NEXT:[[CNT:%.*]] = call i64 @llvm.aarch64.sve.cntw(i32 8)
+; CHECK-NEXT:[[OUT:%.*]] = mul i64 [[CNT]], 3
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %cnt = call i64 @llvm.aarch64.sve.cntw(i32 8)
+  %out = mul i64 %cnt, 3
+  ret i64 %out
+}
+
+define i64 @cntw_mul4() {
+; CHECK-LABEL: @cntw_mul4(
+; CHECK-NEXT:[[CNT:%.*]] = call i64 @llvm.aarch64.sve.cntw(i32 9)
+; CHECK-NEXT:[[OUT:%.*]] = shl i64 [[CNT]], 2
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %cnt = call i64 @llvm.aarch64.sve.cntw(i32 9)
+  %out = mul i64 %cnt, 4
+  ret i64 %out
+}
+
+define i64 @cntw_all() {
+; CHECK-LABEL: @cntw_all(
+; CHECK-NEXT:[[TMP1:%.*]] = call i64 @llvm.vscale.i64()
+; CHECK-NEXT:[[OUT:%.*]] = shl i64 [[TMP1]], 2
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %out = call i64 @llvm.aarch64.sve.cntw(i32 31)
+  ret i64 %out
+}
+
+
+;
+; CNTD
+;
+
+define i64 @cntd_vl1() {
+; CHECK-LABEL: @cntd_vl1(
+; CHECK-NEXT:ret i64 1
+;
+  %out = call i64 @llvm.aarch

[PATCH] D104852: [AArch64][SVEIntrinsicOpts] Convert cntb/h/w/d to vscale intrinsic or constant.

2021-06-27 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

@sdesmalen @david-arm @paulwalker-arm kindly ping~


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104852/new/

https://reviews.llvm.org/D104852


[PATCH] D104852: [AArch64][SVEIntrinsicOpts] Convert cntb/h/w/d to vscale intrinsic or constant.

2021-06-28 Thread JunMa via Phabricator via cfe-commits

junparser added inline comments.



Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:662
+return IC.replaceInstUsesWith(II, StepVal);
+  } else if (Pattern == AArch64SVEPredPattern::vl16 && NumElts == 16) {
+Constant *StepVal = ConstantInt::get(II.getType(), NumElts);

david-arm wrote:
> Could you potentially fold these two cases into one somehow? Maybe with a 
> switch-case statement? I'm just imagining a situation where we might want 
> other patterns too like vl32, vl64, etc.
> 
There is no other special pattern except vl16, but I do think a switch-case
is more straightforward.
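
As a reading aid, the folding behaviour that the CHECK lines in the test file
appear to encode can be modelled in a few lines of standalone C++. This is a
sketch inferred from the test expectations, not the code in
AArch64TargetTransformInfo.cpp; FoldKind, Fold, fixedPatternElts and foldCount
are hypothetical names. It assumes the pattern encoding vl1..vl8 = 1..8,
vl16 = 9, all = 31, and that minElts is the element count of a minimum-size
(128-bit) vector: 16 for cntb, 8 for cnth, 4 for cntw, 2 for cntd.

```cpp
#include <cassert>
#include <cstdint>

enum class FoldKind { Constant, VScaleShift, NotFolded };

struct Fold {
  FoldKind kind;
  std::uint64_t value; // constant result, or shift amount applied to vscale
};

// Element count requested by a fixed "vlN" pattern, or 0 for anything else.
static unsigned fixedPatternElts(unsigned pattern) {
  switch (pattern) {
  case 1: case 2: case 3: case 4:
  case 5: case 6: case 7: case 8:
    return pattern; // vl1 .. vl8
  case 9:
    return 16;      // vl16
  default:
    return 0;
  }
}

// Integer log2 for powers of two.
static unsigned log2u(unsigned v) {
  unsigned shift = 0;
  while (v > 1) {
    v >>= 1;
    ++shift;
  }
  return shift;
}

Fold foldCount(unsigned pattern, unsigned minElts) {
  if (pattern == 31) // "all": minElts * vscale, i.e. vscale << log2(minElts)
    return {FoldKind::VScaleShift, log2u(minElts)};
  unsigned n = fixedPatternElts(pattern);
  if (n != 0 && n <= minElts) // vlN always fits: fold to the constant N
    return {FoldKind::Constant, n};
  return {FoldKind::NotFolded, 0}; // keep the intrinsic call
}
```

Under these assumptions foldCount(9, 16) folds to the constant 16 (the
cntb_mul4 case), while foldCount(9, 8) and foldCount(8, 4) stay unfolded,
matching cnth_mul4 and cntw_mul3.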


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104852/new/

https://reviews.llvm.org/D104852


[PATCH] D104852: [AArch64][SVEIntrinsicOpts] Convert cntb/h/w/d to vscale intrinsic or constant.

2021-06-28 Thread JunMa via Phabricator via cfe-commits

junparser updated this revision to Diff 354846.
junparser added a comment.

Address comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104852/new/

https://reviews.llvm.org/D104852

Files:
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cntb.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cntd.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cnth.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cntw.c
  llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
  llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-counting-elems.ll

Index: llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-counting-elems.ll
===
--- /dev/null
+++ llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-counting-elems.ll
@@ -0,0 +1,247 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -S -instcombine < %s | FileCheck %s
+
+target triple = "aarch64-unknown-linux-gnu"
+
+;
+; CNTB
+;
+
+define i64 @cntb_vl1() {
+; CHECK-LABEL: @cntb_vl1(
+; CHECK-NEXT:ret i64 1
+;
+  %out = call i64 @llvm.aarch64.sve.cntb(i32 1)
+  ret i64 %out
+}
+
+define i64 @cntb_vl2() {
+; CHECK-LABEL: @cntb_vl2(
+; CHECK-NEXT:ret i64 2
+;
+  %out = call i64 @llvm.aarch64.sve.cntb(i32 2)
+  ret i64 %out
+}
+
+define i64 @cntb_vl4() {
+; CHECK-LABEL: @cntb_vl4(
+; CHECK-NEXT:ret i64 4
+;
+  %out = call i64 @llvm.aarch64.sve.cntb(i32 4)
+  ret i64 %out
+}
+
+define i64 @cntb_mul3() {
+; CHECK-LABEL: @cntb_mul3(
+; CHECK-NEXT:ret i64 24
+;
+  %cnt = call i64 @llvm.aarch64.sve.cntb(i32 8)
+  %out = mul i64 %cnt, 3
+  ret i64 %out
+}
+
+define i64 @cntb_mul4() {
+; CHECK-LABEL: @cntb_mul4(
+; CHECK-NEXT:ret i64 64
+;
+  %cnt = call i64 @llvm.aarch64.sve.cntb(i32 9)
+  %out = mul i64 %cnt, 4
+  ret i64 %out
+}
+
+define i64 @cntb_all() {
+; CHECK-LABEL: @cntb_all(
+; CHECK-NEXT:[[TMP1:%.*]] = call i64 @llvm.vscale.i64()
+; CHECK-NEXT:[[OUT:%.*]] = shl i64 [[TMP1]], 4
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %out = call i64 @llvm.aarch64.sve.cntb(i32 31)
+  ret i64 %out
+}
+
+;
+; CNTH
+;
+
+define i64 @cnth_vl1() {
+; CHECK-LABEL: @cnth_vl1(
+; CHECK-NEXT:ret i64 1
+;
+  %out = call i64 @llvm.aarch64.sve.cnth(i32 1)
+  ret i64 %out
+}
+
+define i64 @cnth_vl2() {
+; CHECK-LABEL: @cnth_vl2(
+; CHECK-NEXT:ret i64 2
+;
+  %out = call i64 @llvm.aarch64.sve.cnth(i32 2)
+  ret i64 %out
+}
+
+define i64 @cnth_vl4() {
+; CHECK-LABEL: @cnth_vl4(
+; CHECK-NEXT:ret i64 4
+;
+  %out = call i64 @llvm.aarch64.sve.cnth(i32 4)
+  ret i64 %out
+}
+
+define i64 @cnth_mul3() {
+; CHECK-LABEL: @cnth_mul3(
+; CHECK-NEXT:ret i64 24
+;
+  %cnt = call i64 @llvm.aarch64.sve.cnth(i32 8)
+  %out = mul i64 %cnt, 3
+  ret i64 %out
+}
+
+define i64 @cnth_mul4() {
+; CHECK-LABEL: @cnth_mul4(
+; CHECK-NEXT:[[CNT:%.*]] = call i64 @llvm.aarch64.sve.cnth(i32 9)
+; CHECK-NEXT:[[OUT:%.*]] = shl i64 [[CNT]], 2
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %cnt = call i64 @llvm.aarch64.sve.cnth(i32 9)
+  %out = mul i64 %cnt, 4
+  ret i64 %out
+}
+
+define i64 @cnth_all() {
+; CHECK-LABEL: @cnth_all(
+; CHECK-NEXT:[[TMP1:%.*]] = call i64 @llvm.vscale.i64()
+; CHECK-NEXT:[[OUT:%.*]] = shl i64 [[TMP1]], 3
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %out = call i64 @llvm.aarch64.sve.cnth(i32 31)
+  ret i64 %out
+}
+
+;
+; CNTW
+;
+
+define i64 @cntw_vl1() {
+; CHECK-LABEL: @cntw_vl1(
+; CHECK-NEXT:ret i64 1
+;
+  %out = call i64 @llvm.aarch64.sve.cntw(i32 1)
+  ret i64 %out
+}
+
+define i64 @cntw_vl2() {
+; CHECK-LABEL: @cntw_vl2(
+; CHECK-NEXT:ret i64 2
+;
+  %out = call i64 @llvm.aarch64.sve.cntw(i32 2)
+  ret i64 %out
+}
+
+define i64 @cntw_vl4() {
+; CHECK-LABEL: @cntw_vl4(
+; CHECK-NEXT:ret i64 4
+;
+  %out = call i64 @llvm.aarch64.sve.cntw(i32 4)
+  ret i64 %out
+}
+
+define i64 @cntw_mul3() {
+; CHECK-LABEL: @cntw_mul3(
+; CHECK-NEXT:[[CNT:%.*]] = call i64 @llvm.aarch64.sve.cntw(i32 8)
+; CHECK-NEXT:[[OUT:%.*]] = mul i64 [[CNT]], 3
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %cnt = call i64 @llvm.aarch64.sve.cntw(i32 8)
+  %out = mul i64 %cnt, 3
+  ret i64 %out
+}
+
+define i64 @cntw_mul4() {
+; CHECK-LABEL: @cntw_mul4(
+; CHECK-NEXT:[[CNT:%.*]] = call i64 @llvm.aarch64.sve.cntw(i32 9)
+; CHECK-NEXT:[[OUT:%.*]] = shl i64 [[CNT]], 2
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %cnt = call i64 @llvm.aarch64.sve.cntw(i32 9)
+  %out = mul i64 %cnt, 4
+  ret i64 %out
+}
+
+define i64 @cntw_all() {
+; CHECK-LABEL: @cntw_all(
+; CHECK-NEXT:[[TMP1:%.*]] = call i64 @llvm.vscale.i64()
+; CHECK-NEXT:[[OUT:%.*]] = shl i64 [[TMP1]], 2
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %out = call i64 @llvm.aarch64.sve.cntw(i32 31)
+  ret i64 %out
+}
+
+
+;
+; CNTD
+;
+
+define i64 @cntd_vl1() {
+; CHECK-LABEL: @cntd_vl1(
+; CHECK-NEXT:ret i64 1
+;
+  %out = call i64 @llvm.aarch64.sve.cntd(i32 1)
+  ret i64 %out
+}
+
+define i64 @cntd_vl2() {
+; CHE

[PATCH] D104852: [AArch64][SVEIntrinsicOpts] Convert cntb/h/w/d to vscale intrinsic or constant.

2021-06-28 Thread JunMa via Phabricator via cfe-commits

junparser updated this revision to Diff 354857.
junparser added a comment.

address comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104852/new/

https://reviews.llvm.org/D104852

Files:
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cntb.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cntd.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cnth.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cntw.c
  llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
  llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-counting-elems.ll

Index: llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-counting-elems.ll
===
--- /dev/null
+++ llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-counting-elems.ll
@@ -0,0 +1,247 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -S -instcombine < %s | FileCheck %s
+
+target triple = "aarch64-unknown-linux-gnu"
+
+;
+; CNTB
+;
+
+define i64 @cntb_vl1() {
+; CHECK-LABEL: @cntb_vl1(
+; CHECK-NEXT:ret i64 1
+;
+  %out = call i64 @llvm.aarch64.sve.cntb(i32 1)
+  ret i64 %out
+}
+
+define i64 @cntb_vl2() {
+; CHECK-LABEL: @cntb_vl2(
+; CHECK-NEXT:ret i64 2
+;
+  %out = call i64 @llvm.aarch64.sve.cntb(i32 2)
+  ret i64 %out
+}
+
+define i64 @cntb_vl4() {
+; CHECK-LABEL: @cntb_vl4(
+; CHECK-NEXT:ret i64 4
+;
+  %out = call i64 @llvm.aarch64.sve.cntb(i32 4)
+  ret i64 %out
+}
+
+define i64 @cntb_mul3() {
+; CHECK-LABEL: @cntb_mul3(
+; CHECK-NEXT:ret i64 24
+;
+  %cnt = call i64 @llvm.aarch64.sve.cntb(i32 8)
+  %out = mul i64 %cnt, 3
+  ret i64 %out
+}
+
+define i64 @cntb_mul4() {
+; CHECK-LABEL: @cntb_mul4(
+; CHECK-NEXT:ret i64 64
+;
+  %cnt = call i64 @llvm.aarch64.sve.cntb(i32 9)
+  %out = mul i64 %cnt, 4
+  ret i64 %out
+}
+
+define i64 @cntb_all() {
+; CHECK-LABEL: @cntb_all(
+; CHECK-NEXT:[[TMP1:%.*]] = call i64 @llvm.vscale.i64()
+; CHECK-NEXT:[[OUT:%.*]] = shl i64 [[TMP1]], 4
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %out = call i64 @llvm.aarch64.sve.cntb(i32 31)
+  ret i64 %out
+}
+
+;
+; CNTH
+;
+
+define i64 @cnth_vl1() {
+; CHECK-LABEL: @cnth_vl1(
+; CHECK-NEXT:ret i64 1
+;
+  %out = call i64 @llvm.aarch64.sve.cnth(i32 1)
+  ret i64 %out
+}
+
+define i64 @cnth_vl2() {
+; CHECK-LABEL: @cnth_vl2(
+; CHECK-NEXT:ret i64 2
+;
+  %out = call i64 @llvm.aarch64.sve.cnth(i32 2)
+  ret i64 %out
+}
+
+define i64 @cnth_vl4() {
+; CHECK-LABEL: @cnth_vl4(
+; CHECK-NEXT:ret i64 4
+;
+  %out = call i64 @llvm.aarch64.sve.cnth(i32 4)
+  ret i64 %out
+}
+
+define i64 @cnth_mul3() {
+; CHECK-LABEL: @cnth_mul3(
+; CHECK-NEXT:ret i64 24
+;
+  %cnt = call i64 @llvm.aarch64.sve.cnth(i32 8)
+  %out = mul i64 %cnt, 3
+  ret i64 %out
+}
+
+define i64 @cnth_mul4() {
+; CHECK-LABEL: @cnth_mul4(
+; CHECK-NEXT:[[CNT:%.*]] = call i64 @llvm.aarch64.sve.cnth(i32 9)
+; CHECK-NEXT:[[OUT:%.*]] = shl i64 [[CNT]], 2
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %cnt = call i64 @llvm.aarch64.sve.cnth(i32 9)
+  %out = mul i64 %cnt, 4
+  ret i64 %out
+}
+
+define i64 @cnth_all() {
+; CHECK-LABEL: @cnth_all(
+; CHECK-NEXT:[[TMP1:%.*]] = call i64 @llvm.vscale.i64()
+; CHECK-NEXT:[[OUT:%.*]] = shl i64 [[TMP1]], 3
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %out = call i64 @llvm.aarch64.sve.cnth(i32 31)
+  ret i64 %out
+}
+
+;
+; CNTW
+;
+
+define i64 @cntw_vl1() {
+; CHECK-LABEL: @cntw_vl1(
+; CHECK-NEXT:ret i64 1
+;
+  %out = call i64 @llvm.aarch64.sve.cntw(i32 1)
+  ret i64 %out
+}
+
+define i64 @cntw_vl2() {
+; CHECK-LABEL: @cntw_vl2(
+; CHECK-NEXT:ret i64 2
+;
+  %out = call i64 @llvm.aarch64.sve.cntw(i32 2)
+  ret i64 %out
+}
+
+define i64 @cntw_vl4() {
+; CHECK-LABEL: @cntw_vl4(
+; CHECK-NEXT:ret i64 4
+;
+  %out = call i64 @llvm.aarch64.sve.cntw(i32 4)
+  ret i64 %out
+}
+
+define i64 @cntw_mul3() {
+; CHECK-LABEL: @cntw_mul3(
+; CHECK-NEXT:[[CNT:%.*]] = call i64 @llvm.aarch64.sve.cntw(i32 8)
+; CHECK-NEXT:[[OUT:%.*]] = mul i64 [[CNT]], 3
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %cnt = call i64 @llvm.aarch64.sve.cntw(i32 8)
+  %out = mul i64 %cnt, 3
+  ret i64 %out
+}
+
+define i64 @cntw_mul4() {
+; CHECK-LABEL: @cntw_mul4(
+; CHECK-NEXT:[[CNT:%.*]] = call i64 @llvm.aarch64.sve.cntw(i32 9)
+; CHECK-NEXT:[[OUT:%.*]] = shl i64 [[CNT]], 2
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %cnt = call i64 @llvm.aarch64.sve.cntw(i32 9)
+  %out = mul i64 %cnt, 4
+  ret i64 %out
+}
+
+define i64 @cntw_all() {
+; CHECK-LABEL: @cntw_all(
+; CHECK-NEXT:[[TMP1:%.*]] = call i64 @llvm.vscale.i64()
+; CHECK-NEXT:[[OUT:%.*]] = shl i64 [[TMP1]], 2
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %out = call i64 @llvm.aarch64.sve.cntw(i32 31)
+  ret i64 %out
+}
+
+
+;
+; CNTD
+;
+
+define i64 @cntd_vl1() {
+; CHECK-LABEL: @cntd_vl1(
+; CHECK-NEXT:ret i64 1
+;
+  %out = call i64 @llvm.aarch64.sve.cntd(i32 1)
+  ret i64 %out
+}
+
+define i64 @cntd_vl2() {
+; CHE

[PATCH] D105097: [clang][AArch64][SVE] Handle PRValue under VLAT <-> VLST cast

2021-06-29 Thread JunMa via Phabricator via cfe-commits

junparser created this revision.
junparser added reviewers: joechrisellis, c-rhodes, efriedma, aeubanks, bsmith.
Herald added subscribers: psnobl, kristof.beyls, tschuett.
junparser requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

This change fixes a crash that occurs when a PRValue is passed to EmitLValue,
which cannot handle it.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D105097

Files:
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c


Index: clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
===
--- clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
+++ clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
@@ -103,3 +103,46 @@
   parr = &arr[0];
   return *parr;
 }
+
+// CHECK-LABEL: @test_cast(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca <16 x i32>, align 16
+// CHECK-NEXT:[[PRED_ADDR:%.*]] = alloca <vscale x 16 x i1>, align 2
+// CHECK-NEXT:[[VEC_ADDR:%.*]] = alloca <vscale x 4 x i32>, align 16
+// CHECK-NEXT:[[XX:%.*]] = alloca <16 x i32>, align 16
+// CHECK-NEXT:[[YY:%.*]] = alloca <16 x i32>, align 16
+// CHECK-NEXT:[[PG:%.*]] = alloca <vscale x 16 x i1>, align 2
+// CHECK-NEXT:[[SAVED_PRVALUE:%.*]] = alloca <16 x i32>, align 64
+// CHECK-NEXT:store <vscale x 16 x i1> [[PRED:%.*]], <vscale x 16 x i1>* [[PRED_ADDR]], align 2
+// CHECK-NEXT:store <vscale x 4 x i32> [[VEC:%.*]], <vscale x 4 x i32>* [[VEC_ADDR]], align 16
+// CHECK-NEXT:store <16 x i32> <i32 1, i32 2, i32 3, i32 4, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>, <16 x i32>* [[XX]], align 16
+// CHECK-NEXT:store <16 x i32> <i32 2, i32 5, i32 4, i32 6, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>, <16 x i32>* [[YY]], align 16
+// CHECK-NEXT:[[TMP0:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[PRED_ADDR]], align 2
+// CHECK-NEXT:[[TMP1:%.*]] = load <8 x i8>, <8 x i8>* @global_pred, align 2
+// CHECK-NEXT:[[TMP2:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* bitcast (<8 x i8>* @global_pred to <vscale x 16 x i1>*), align 2
+// CHECK-NEXT:[[TMP3:%.*]] = load <16 x i32>, <16 x i32>* [[XX]], align 16
+// CHECK-NEXT:[[TMP4:%.*]] = load <16 x i32>, <16 x i32>* [[YY]], align 16
+// CHECK-NEXT:[[ADD:%.*]] = add <16 x i32> [[TMP3]], [[TMP4]]
+// CHECK-NEXT:store <16 x i32> [[ADD]], <16 x i32>* [[SAVED_PRVALUE]], align 64
+// CHECK-NEXT:[[TMP5:%.*]] = bitcast <16 x i32>* [[SAVED_PRVALUE]] to <vscale x 16 x i1>*
+// CHECK-NEXT:[[TMP6:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[TMP5]], align 64
+// CHECK-NEXT:[[TMP7:%.*]] = call <vscale x 16 x i1> @llvm.aarch64.sve.and.z.nxv16i1(<vscale x 16 x i1> [[TMP0]], <vscale x 16 x i1> [[TMP2]], <vscale x 16 x i1> [[TMP6]])
+// CHECK-NEXT:store <vscale x 16 x i1> [[TMP7]], <vscale x 16 x i1>* [[PG]], align 2
+// CHECK-NEXT:[[TMP8:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[PG]], align 2
+// CHECK-NEXT:[[TMP9:%.*]] = load <16 x i32>, <16 x i32>* @global_vec, align 16
+// CHECK-NEXT:[[CASTSCALABLESVE:%.*]] = call <vscale x 4 x i32> @llvm.experimental.vector.insert.nxv4i32.v16i32(<vscale x 4 x i32> undef, <16 x i32> [[TMP9]], i64 0)
+// CHECK-NEXT:[[TMP10:%.*]] = load <vscale x 4 x i32>, <vscale x 4 x i32>* [[VEC_ADDR]], align 16
+// CHECK-NEXT:[[TMP11:%.*]] = call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[TMP8]])
+// CHECK-NEXT:[[TMP12:%.*]] = call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP11]], <vscale x 4 x i32> [[CASTSCALABLESVE]], <vscale x 4 x i32> [[TMP10]])
+// CHECK-NEXT:[[CASTFIXEDSVE:%.*]] = call <16 x i32> @llvm.experimental.vector.extract.v16i32.nxv4i32(<vscale x 4 x i32> [[TMP12]], i64 0)
+// CHECK-NEXT:store <16 x i32> [[CASTFIXEDSVE]], <16 x i32>* [[RETVAL]], align 16
+// CHECK-NEXT:[[TMP13:%.*]] = load <16 x i32>, <16 x i32>* [[RETVAL]], align 16
+// CHECK-NEXT:[[CASTSCALABLESVE1:%.*]] = call <vscale x 4 x i32> @llvm.experimental.vector.insert.nxv4i32.v16i32(<vscale x 4 x i32> undef, <16 x i32> [[TMP13]], i64 0)
+// CHECK-NEXT:ret <vscale x 4 x i32> [[CASTSCALABLESVE1]]
+//
+fixed_int32_t test_cast(svbool_t pred, svint32_t vec) {
+  fixed_int32_t xx = {1, 2, 3, 4};
+  fixed_int32_t yy = {2, 5, 4, 6};
+  svbool_t pg = svand_z(pred, global_pred, xx + yy);
+  return svadd_m(pg, global_vec, vec);
+}
Index: clang/lib/CodeGen/CGExprScalar.cpp
===
--- clang/lib/CodeGen/CGExprScalar.cpp
+++ clang/lib/CodeGen/CGExprScalar.cpp
@@ -2111,7 +2111,13 @@
 return EmitLoadOfLValue(DestLV, CE->getExprLoc());
   }
 
-  Address Addr = EmitLValue(E).getAddress(CGF);
+  Address Addr = Address::invalid();
+  if (E->isPRValue() && !isa(E)) {
+Addr = CGF.CreateDefaultAlignTempAlloca(SrcTy, "saved-prvalue");
+LValue LV = CGF.MakeAddrLValue(Addr, E->getType());
+CGF.EmitStoreOfScalar(Src, LV);
+  } else
+Addr = EmitLValue(E).getAddress(CGF);
   Addr = Builder.CreateElementBitCast(Addr, CGF.ConvertTypeForMem(DestTy));
   LValue DestLV = CGF.MakeAddrLValue(Addr, DestTy);
   DestLV.setTBAAInfo(TBAAAccessInfo::getMayAliasInfo());



[PATCH] D105097: [clang][AArch64][SVE] Handle PRValue under VLAT <-> VLST cast

2021-06-29 Thread JunMa via Phabricator via cfe-commits

junparser added inline comments.



Comment at: clang/lib/CodeGen/CGExprScalar.cpp:2120
+  } else
+Addr = EmitLValue(E).getAddress(CGF);
   Addr = Builder.CreateElementBitCast(Addr, CGF.ConvertTypeForMem(DestTy));

efriedma wrote:
> I don't think it's legal to use EmitLValue here at all; the emitted IR could 
> have side-effects.
> 
> Since we have the vector insert/extract intrinsics now, can we just use them 
> here instead of going through the load/store dance?
> I don't think it's legal to use EmitLValue here at all; the emitted IR could 
> have side-effects.
> 
I agree since we have already visited E. 

> Since we have the vector insert/extract intrinsics now, can we just use them 
> here instead of going through the load/store dance?

We already use the insert/extract intrinsics when the element types are the
same; a predicate cast can only be handled through memory.
One idea here is to always use store + load. What do you think?
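
For readers less familiar with the "through memory" idea, here is a loose
plain-C++ analogy: save the value in a temporary, then read the same bytes
back as the destination type. It is not clang CodeGen code; FixedPred and
castThroughMemory are hypothetical names, and the 8-byte size is chosen only
to mirror the <8 x i8> predicate layout in the 512-bit tests.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Stand-in for a fixed-length predicate value: 8 raw bytes.
struct FixedPred {
  std::uint8_t bytes[8];
};

static_assert(sizeof(FixedPred) == sizeof(std::uint64_t),
              "source and destination must have the same size");

// Reinterpret the value by going through memory rather than converting it.
std::uint64_t castThroughMemory(FixedPred value) {
  FixedPred saved = value;                      // temporary slot + store
  std::uint64_t result;
  std::memcpy(&result, &saved, sizeof(result)); // load as destination type
  return result;
}
```

The value is evaluated once and then only its stored bytes are reread, so no
expression with side effects is emitted a second time.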



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105097/new/

https://reviews.llvm.org/D105097


[PATCH] D105097: [clang][AArch64][SVE] Handle PRValue under VLAT <-> VLST cast

2021-06-29 Thread JunMa via Phabricator via cfe-commits

junparser added inline comments.



Comment at: clang/lib/CodeGen/CGExprScalar.cpp:2120
+  } else
+Addr = EmitLValue(E).getAddress(CGF);
   Addr = Builder.CreateElementBitCast(Addr, CGF.ConvertTypeForMem(DestTy));

junparser wrote:
> efriedma wrote:
> > I don't think it's legal to use EmitLValue here at all; the emitted IR 
> > could have side-effects.
> > 
> > Since we have the vector insert/extract intrinsics now, can we just use 
> > them here instead of going through the load/store dance?
> > I don't think it's legal to use EmitLValue here at all; the emitted IR 
> > could have side-effects.
> > 
> I agree since we have already visited E. 
> 
> > Since we have the vector insert/extract intrinsics now, can we just use 
> > them here instead of going through the load/store dance?
> 
> we have already use insert/extract intrinsics for same element type, we can 
> only handle predicate cast through memory.
> One of idea here is always use store + load. what do you think?
> 
@efriedma I'm also working on a patch that optimizes this store + bitcast +
load pattern for constant vectors, so maybe it is OK to always use alloca +
load.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105097/new/

https://reviews.llvm.org/D105097


[PATCH] D105097: [clang][AArch64][SVE] Handle PRValue under VLAT <-> VLST cast

2021-06-29 Thread JunMa via Phabricator via cfe-commits

junparser added inline comments.



Comment at: clang/lib/CodeGen/CGExprScalar.cpp:2120
+  } else
+Addr = EmitLValue(E).getAddress(CGF);
   Addr = Builder.CreateElementBitCast(Addr, CGF.ConvertTypeForMem(DestTy));

efriedma wrote:
> junparser wrote:
> > junparser wrote:
> > > efriedma wrote:
> > > > I don't think it's legal to use EmitLValue here at all; the emitted IR 
> > > > could have side-effects.
> > > > 
> > > > Since we have the vector insert/extract intrinsics now, can we just use 
> > > > them here instead of going through the load/store dance?
> > > > I don't think it's legal to use EmitLValue here at all; the emitted IR 
> > > > could have side-effects.
> > > > 
> > > I agree since we have already visited E. 
> > > 
> > > > Since we have the vector insert/extract intrinsics now, can we just use 
> > > > them here instead of going through the load/store dance?
> > > 
> > > we have already use insert/extract intrinsics for same element type, we 
> > > can only handle predicate cast through memory.
> > > One of idea here is always use store + load. what do you think?
> > > 
> > @efriedma I'm also working on a patch that optimize such store + bitcast + 
> > load pattern with constant vector, so maybe it is ok to always use alloca + 
> > load
> I'd be happy to accept just unconditionally doing the alloca+store+load thing 
> for now.
> 
> Not sure I understand why predicates are special here.  Even if we can't 
> handle predicates directly in insert/extract intrinsics, we can always just 
> zero-extend to a bigger integer type, do the cast, then truncate the result.
I'm not sure whether zext + cast + trunc to vxi1 would generate better code. I
think it is better to handle this in LLVM IR without the extend and truncate,
using only bitcast and insert/extract.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105097/new/

https://reviews.llvm.org/D105097


[PATCH] D105097: [clang][AArch64][SVE] Handle PRValue under VLAT <-> VLST cast

2021-06-30 Thread JunMa via Phabricator via cfe-commits

junparser updated this revision to Diff 355460.
junparser added a comment.

address comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105097/new/

https://reviews.llvm.org/D105097

Files:
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-call.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c

Index: clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
===
--- clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
+++ clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
@@ -22,13 +22,13 @@
 // CHECK-128-LABEL: @write_global_i64(
 // CHECK-128-NEXT:  entry:
 // CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = call <2 x i64> @llvm.experimental.vector.extract.v2i64.nxv2i64(<vscale x 2 x i64> [[V:%.*]], i64 0)
-// CHECK-128-NEXT:store <2 x i64> [[CASTFIXEDSVE]], <2 x i64>* @global_i64, align 16, [[TBAA6:!tbaa !.*]]
+// CHECK-128-NEXT:store <2 x i64> [[CASTFIXEDSVE]], <2 x i64>* @global_i64, align 16, !tbaa [[TBAA6:![0-9]+]]
 // CHECK-128-NEXT:ret void
 //
 // CHECK-512-LABEL: @write_global_i64(
 // CHECK-512-NEXT:  entry:
 // CHECK-512-NEXT:[[CASTFIXEDSVE:%.*]] = call <8 x i64> @llvm.experimental.vector.extract.v8i64.nxv2i64(<vscale x 2 x i64> [[V:%.*]], i64 0)
-// CHECK-512-NEXT:store <8 x i64> [[CASTFIXEDSVE]], <8 x i64>* @global_i64, align 16, [[TBAA6:!tbaa !.*]]
+// CHECK-512-NEXT:store <8 x i64> [[CASTFIXEDSVE]], <8 x i64>* @global_i64, align 16, !tbaa [[TBAA6:![0-9]+]]
 // CHECK-512-NEXT:ret void
 //
 void write_global_i64(svint64_t v) { global_i64 = v; }
@@ -36,33 +36,33 @@
 // CHECK-128-LABEL: @write_global_bf16(
 // CHECK-128-NEXT:  entry:
 // CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = call <8 x bfloat> @llvm.experimental.vector.extract.v8bf16.nxv8bf16(<vscale x 8 x bfloat> [[V:%.*]], i64 0)
-// CHECK-128-NEXT:store <8 x bfloat> [[CASTFIXEDSVE]], <8 x bfloat>* @global_bf16, align 16, [[TBAA6]]
+// CHECK-128-NEXT:store <8 x bfloat> [[CASTFIXEDSVE]], <8 x bfloat>* @global_bf16, align 16, !tbaa [[TBAA6]]
 // CHECK-128-NEXT:ret void
 //
 // CHECK-512-LABEL: @write_global_bf16(
 // CHECK-512-NEXT:  entry:
 // CHECK-512-NEXT:[[CASTFIXEDSVE:%.*]] = call <32 x bfloat> @llvm.experimental.vector.extract.v32bf16.nxv8bf16(<vscale x 8 x bfloat> [[V:%.*]], i64 0)
-// CHECK-512-NEXT:store <32 x bfloat> [[CASTFIXEDSVE]], <32 x bfloat>* @global_bf16, align 16, [[TBAA6]]
+// CHECK-512-NEXT:store <32 x bfloat> [[CASTFIXEDSVE]], <32 x bfloat>* @global_bf16, align 16, !tbaa [[TBAA6]]
 // CHECK-512-NEXT:ret void
 //
 void write_global_bf16(svbfloat16_t v) { global_bf16 = v; }
 
 // CHECK-128-LABEL: @write_global_bool(
 // CHECK-128-NEXT:  entry:
-// CHECK-128-NEXT:[[V_ADDR:%.*]] = alloca <vscale x 16 x i1>, align 16
-// CHECK-128-NEXT:store <vscale x 16 x i1> [[V:%.*]], <vscale x 16 x i1>* [[V_ADDR]], align 16, [[TBAA9:!tbaa !.*]]
-// CHECK-128-NEXT:[[TMP0:%.*]] = bitcast <vscale x 16 x i1>* [[V_ADDR]] to <2 x i8>*
-// CHECK-128-NEXT:[[TMP1:%.*]] = load <2 x i8>, <2 x i8>* [[TMP0]], align 16, [[TBAA6]]
-// CHECK-128-NEXT:store <2 x i8> [[TMP1]], <2 x i8>* @global_bool, align 2, [[TBAA6]]
+// CHECK-128-NEXT:[[SAVED_VALUE:%.*]] = alloca <vscale x 16 x i1>, align 16
+// CHECK-128-NEXT:store <vscale x 16 x i1> [[V:%.*]], <vscale x 16 x i1>* [[SAVED_VALUE]], align 16, !tbaa [[TBAA9:![0-9]+]]
+// CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = bitcast <vscale x 16 x i1>* [[SAVED_VALUE]] to <2 x i8>*
+// CHECK-128-NEXT:[[TMP0:%.*]] = load <2 x i8>, <2 x i8>* [[CASTFIXEDSVE]], align 16, !tbaa [[TBAA6]]
+// CHECK-128-NEXT:store <2 x i8> [[TMP0]], <2 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
 // CHECK-128-NEXT:ret void
 //
 // CHECK-512-LABEL: @write_global_bool(
 // CHECK-512-NEXT:  entry:
-// CHECK-512-NEXT:[[V_ADDR:%.*]] = alloca <vscale x 16 x i1>, align 16
-// CHECK-512-NEXT:store <vscale x 16 x i1> [[V:%.*]], <vscale x 16 x i1>* [[V_ADDR]], align 16, [[TBAA9:!tbaa !.*]]
-// CHECK-512-NEXT:[[TMP0:%.*]] = bitcast <vscale x 16 x i1>* [[V_ADDR]] to <8 x i8>*
-// CHECK-512-NEXT:[[TMP1:%.*]] = load <8 x i8>, <8 x i8>* [[TMP0]], align 16, [[TBAA6]]
-// CHECK-512-NEXT:store <8 x i8> [[TMP1]], <8 x i8>* @global_bool, align 2, [[TBAA6]]
+// CHECK-512-NEXT:[[SAVED_VALUE:%.*]] = alloca <vscale x 16 x i1>, align 16
+// CHECK-512-NEXT:store <vscale x 16 x i1> [[V:%.*]], <vscale x 16 x i1>* [[SAVED_VALUE]], align 16, !tbaa [[TBAA9:![0-9]+]]
+// CHECK-512-NEXT:[[CASTFIXEDSVE:%.*]] = bitcast <vscale x 16 x i1>* [[SAVED_VALUE]] to <8 x i8>*
+// CHECK-512-NEXT:[[TMP0:%.*]] = load <8 x i8>, <8 x i8>* [[CASTFIXEDSVE]], align 16, !tbaa [[TBAA6]]
+// CHECK-512-NEXT:store <8 x i8> [[TMP0]], <8 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
 // CHECK-512-NEXT:ret void
 //
 void write_global_bool(svbool_t v) { global_bool = v; }
@@ -73,13 +73,13 @@
 
 // CHECK-128-LABEL: @read_global_i64(
 // CHECK-128-NEXT:  entry:
-// CHECK-128-NEXT:[[TMP0:%.*]] = load <2 x i64>, <2 x i64>* @global_i64, align 16, [[TBAA6]]
+// CHECK-128-NEXT:[[TMP0:%.*]] = load <2 x i64>, <2 x i64>* @global_i64, align 16, !tbaa [[TBAA6]]
 //

[PATCH] D105097: [clang][AArch64][SVE] Handle PRValue under VLAT <-> VLST cast

2021-06-30 Thread JunMa via Phabricator via cfe-commits

junparser added inline comments.



Comment at: clang/lib/CodeGen/CGExprScalar.cpp:2103
+  if (const CallExpr *CE = dyn_cast<CallExpr>(E))
+Ty = CE->getCallReturnType(CGF.getContext());
+

efriedma wrote:
> I don't think we need to call getCallReturnType() here.  A call that returns 
> a reference is an lvalue, and the code here expects an rvalue.  So 
> CE->getCallReturnType() is going to be the same as E->getType().
Makes sense to me, thanks!
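
The value-category point in the quoted comment can be checked in plain C++, outside Clang's AST. The following is only a minimal sketch: `returnsValue` and `returnsRef` are hypothetical functions introduced for illustration, and `decltype((expr))` stands in for asking whether a call expression is an lvalue. Clang expression types never carry a reference, so the declared return type and the expression type can only differ for reference-returning calls, which are not rvalues.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical functions, declared only so they can be used in unevaluated
// contexts below; they are never called and need no definition.
int returnsValue();
int &returnsRef();

// decltype((e)) is T& when e is an lvalue and plain T when e is a prvalue.
constexpr bool valueCallIsLvalue =
    std::is_lvalue_reference<decltype((returnsValue()))>::value;
constexpr bool refCallIsLvalue =
    std::is_lvalue_reference<decltype((returnsRef()))>::value;

// A call to a function returning a non-reference type is a prvalue whose
// type equals the declared return type.
static_assert(!valueCallIsLvalue, "call returning int is a prvalue");
static_assert(std::is_same<decltype((returnsValue())), int>::value,
              "prvalue call has the declared return type");

// A call to a function returning an lvalue reference is an lvalue.
static_assert(refCallIsLvalue, "call returning int& is an lvalue");
```

Under that reading, code that only ever sees rvalue call expressions gains nothing from querying the declared return type separately.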


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105097/new/

https://reviews.llvm.org/D105097

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D105097: [clang][AArch64][SVE] Handle PRValue under VLAT <-> VLST cast

2021-06-30 Thread JunMa via Phabricator via cfe-commits

junparser updated this revision to Diff 355484.
junparser added a comment.

Address review comment.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105097/new/

https://reviews.llvm.org/D105097

Files:
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-call.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c

Index: clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
===
--- clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
+++ clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
@@ -22,13 +22,13 @@
 // CHECK-128-LABEL: @write_global_i64(
 // CHECK-128-NEXT:  entry:
 // CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = call <2 x i64> @llvm.experimental.vector.extract.v2i64.nxv2i64(<vscale x 2 x i64> [[V:%.*]], i64 0)
-// CHECK-128-NEXT:store <2 x i64> [[CASTFIXEDSVE]], <2 x i64>* @global_i64, align 16, [[TBAA6:!tbaa !.*]]
+// CHECK-128-NEXT:store <2 x i64> [[CASTFIXEDSVE]], <2 x i64>* @global_i64, align 16, !tbaa [[TBAA6:![0-9]+]]
 // CHECK-128-NEXT:ret void
 //
 // CHECK-512-LABEL: @write_global_i64(
 // CHECK-512-NEXT:  entry:
 // CHECK-512-NEXT:[[CASTFIXEDSVE:%.*]] = call <8 x i64> @llvm.experimental.vector.extract.v8i64.nxv2i64(<vscale x 2 x i64> [[V:%.*]], i64 0)
-// CHECK-512-NEXT:store <8 x i64> [[CASTFIXEDSVE]], <8 x i64>* @global_i64, align 16, [[TBAA6:!tbaa !.*]]
+// CHECK-512-NEXT:store <8 x i64> [[CASTFIXEDSVE]], <8 x i64>* @global_i64, align 16, !tbaa [[TBAA6:![0-9]+]]
 // CHECK-512-NEXT:ret void
 //
 void write_global_i64(svint64_t v) { global_i64 = v; }
@@ -36,33 +36,33 @@
 // CHECK-128-LABEL: @write_global_bf16(
 // CHECK-128-NEXT:  entry:
 // CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = call <8 x bfloat> @llvm.experimental.vector.extract.v8bf16.nxv8bf16(<vscale x 8 x bfloat> [[V:%.*]], i64 0)
-// CHECK-128-NEXT:store <8 x bfloat> [[CASTFIXEDSVE]], <8 x bfloat>* @global_bf16, align 16, [[TBAA6]]
+// CHECK-128-NEXT:store <8 x bfloat> [[CASTFIXEDSVE]], <8 x bfloat>* @global_bf16, align 16, !tbaa [[TBAA6]]
 // CHECK-128-NEXT:ret void
 //
 // CHECK-512-LABEL: @write_global_bf16(
 // CHECK-512-NEXT:  entry:
 // CHECK-512-NEXT:[[CASTFIXEDSVE:%.*]] = call <32 x bfloat> @llvm.experimental.vector.extract.v32bf16.nxv8bf16(<vscale x 8 x bfloat> [[V:%.*]], i64 0)
-// CHECK-512-NEXT:store <32 x bfloat> [[CASTFIXEDSVE]], <32 x bfloat>* @global_bf16, align 16, [[TBAA6]]
+// CHECK-512-NEXT:store <32 x bfloat> [[CASTFIXEDSVE]], <32 x bfloat>* @global_bf16, align 16, !tbaa [[TBAA6]]
 // CHECK-512-NEXT:ret void
 //
 void write_global_bf16(svbfloat16_t v) { global_bf16 = v; }
 
 // CHECK-128-LABEL: @write_global_bool(
 // CHECK-128-NEXT:  entry:
-// CHECK-128-NEXT:[[V_ADDR:%.*]] = alloca <vscale x 16 x i1>, align 16
-// CHECK-128-NEXT:store <vscale x 16 x i1> [[V:%.*]], <vscale x 16 x i1>* [[V_ADDR]], align 16, [[TBAA9:!tbaa !.*]]
-// CHECK-128-NEXT:[[TMP0:%.*]] = bitcast <vscale x 16 x i1>* [[V_ADDR]] to <2 x i8>*
-// CHECK-128-NEXT:[[TMP1:%.*]] = load <2 x i8>, <2 x i8>* [[TMP0]], align 16, [[TBAA6]]
-// CHECK-128-NEXT:store <2 x i8> [[TMP1]], <2 x i8>* @global_bool, align 2, [[TBAA6]]
+// CHECK-128-NEXT:[[SAVED_VALUE:%.*]] = alloca <vscale x 16 x i1>, align 16
+// CHECK-128-NEXT:store <vscale x 16 x i1> [[V:%.*]], <vscale x 16 x i1>* [[SAVED_VALUE]], align 16, !tbaa [[TBAA9:![0-9]+]]
+// CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = bitcast <vscale x 16 x i1>* [[SAVED_VALUE]] to <2 x i8>*
+// CHECK-128-NEXT:[[TMP0:%.*]] = load <2 x i8>, <2 x i8>* [[CASTFIXEDSVE]], align 16, !tbaa [[TBAA6]]
+// CHECK-128-NEXT:store <2 x i8> [[TMP0]], <2 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
 // CHECK-128-NEXT:ret void
 //
 // CHECK-512-LABEL: @write_global_bool(
 // CHECK-512-NEXT:  entry:
-// CHECK-512-NEXT:[[V_ADDR:%.*]] = alloca <vscale x 16 x i1>, align 16
-// CHECK-512-NEXT:store <vscale x 16 x i1> [[V:%.*]], <vscale x 16 x i1>* [[V_ADDR]], align 16, [[TBAA9:!tbaa !.*]]
-// CHECK-512-NEXT:[[TMP0:%.*]] = bitcast <vscale x 16 x i1>* [[V_ADDR]] to <8 x i8>*
-// CHECK-512-NEXT:[[TMP1:%.*]] = load <8 x i8>, <8 x i8>* [[TMP0]], align 16, [[TBAA6]]
-// CHECK-512-NEXT:store <8 x i8> [[TMP1]], <8 x i8>* @global_bool, align 2, [[TBAA6]]
+// CHECK-512-NEXT:[[SAVED_VALUE:%.*]] = alloca <vscale x 16 x i1>, align 16
+// CHECK-512-NEXT:store <vscale x 16 x i1> [[V:%.*]], <vscale x 16 x i1>* [[SAVED_VALUE]], align 16, !tbaa [[TBAA9:![0-9]+]]
+// CHECK-512-NEXT:[[CASTFIXEDSVE:%.*]] = bitcast <vscale x 16 x i1>* [[SAVED_VALUE]] to <8 x i8>*
+// CHECK-512-NEXT:[[TMP0:%.*]] = load <8 x i8>, <8 x i8>* [[CASTFIXEDSVE]], align 16, !tbaa [[TBAA6]]
+// CHECK-512-NEXT:store <8 x i8> [[TMP0]], <8 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
 // CHECK-512-NEXT:ret void
 //
 void write_global_bool(svbool_t v) { global_bool = v; }
@@ -73,13 +73,13 @@
 
 // CHECK-128-LABEL: @read_global_i64(
 // CHECK-128-NEXT:  entry:
-// CHECK-128-NEXT:[[TMP0:%.*]] = load <2 x i64>, <2 x i64>* @global_i64, align 16, [[TBAA6]]
+// CHECK-128-NEXT:[[TMP0:%.*]] = load <2 x i64>, <2 x i64>* @global_i64, align 16, !tbaa [[TBAA6]]
 // C

[PATCH] D104852: [AArch64][SVEIntrinsicOpts] Convect cntb/h/w/d to vscale intrinsic or constant.

2021-06-30 Thread JunMa via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rGae5433945f91: [AArch64][SVEIntrinsicOpts] Convect cntb/h/w/d 
to vscale intrinsic or constant. (authored by junparser).
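
The fold that this patch's InstCombine tests exercise can be modeled in a few lines of C++. This is only a sketch inferred from the CHECK lines in the diff, not the code from AArch64TargetTransformInfo.cpp: `CountFold` and `foldCount` are hypothetical names, and the pattern numbers follow the SVE predicate-pattern encoding (1 to 8 for vl1 to vl8, 9 for vl16, 31 for all).

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical result type: either a plain constant or a multiple of vscale.
struct CountFold {
  bool scalesWithVscale; // true: result is vscale * value; false: constant
  uint64_t value;
};

// minElts is the element count of a minimum-size (128-bit) SVE vector:
// 16 for cntb, 8 for cnth, 4 for cntw, 2 for cntd.
std::optional<CountFold> foldCount(unsigned pattern, uint64_t minElts) {
  if (pattern == 31) // "all": the count is vscale * minElts
    return CountFold{true, minElts};

  uint64_t fixed = 0;
  if (pattern >= 1 && pattern <= 8) // vl1 .. vl8
    fixed = pattern;
  else if (pattern == 9) // vl16
    fixed = 16;

  // A fixed-length pattern folds to a constant only when even the smallest
  // legal vector is guaranteed to hold that many elements.
  if (fixed != 0 && fixed <= minElts)
    return CountFold{false, fixed};

  return std::nullopt; // otherwise the intrinsic call is left in place
}
```

This matches the visible expectations: `cntb(i32 9)` folds to 16 (so `cntb_mul4` returns 64), `cnth(i32 9)` and `cntw(i32 8)` stay as calls, and the `all` pattern becomes `vscale` shifted left by 4, 3, or 2 for bytes, halfwords, and words.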

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104852/new/

https://reviews.llvm.org/D104852

Files:
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cntb.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cntd.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cnth.c
  clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_cntw.c
  llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
  llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-counting-elems.ll

Index: llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-counting-elems.ll
===
--- /dev/null
+++ llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-opts-counting-elems.ll
@@ -0,0 +1,247 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -S -instcombine < %s | FileCheck %s
+
+target triple = "aarch64-unknown-linux-gnu"
+
+;
+; CNTB
+;
+
+define i64 @cntb_vl1() {
+; CHECK-LABEL: @cntb_vl1(
+; CHECK-NEXT:ret i64 1
+;
+  %out = call i64 @llvm.aarch64.sve.cntb(i32 1)
+  ret i64 %out
+}
+
+define i64 @cntb_vl2() {
+; CHECK-LABEL: @cntb_vl2(
+; CHECK-NEXT:ret i64 2
+;
+  %out = call i64 @llvm.aarch64.sve.cntb(i32 2)
+  ret i64 %out
+}
+
+define i64 @cntb_vl4() {
+; CHECK-LABEL: @cntb_vl4(
+; CHECK-NEXT:ret i64 4
+;
+  %out = call i64 @llvm.aarch64.sve.cntb(i32 4)
+  ret i64 %out
+}
+
+define i64 @cntb_mul3() {
+; CHECK-LABEL: @cntb_mul3(
+; CHECK-NEXT:ret i64 24
+;
+  %cnt = call i64 @llvm.aarch64.sve.cntb(i32 8)
+  %out = mul i64 %cnt, 3
+  ret i64 %out
+}
+
+define i64 @cntb_mul4() {
+; CHECK-LABEL: @cntb_mul4(
+; CHECK-NEXT:ret i64 64
+;
+  %cnt = call i64 @llvm.aarch64.sve.cntb(i32 9)
+  %out = mul i64 %cnt, 4
+  ret i64 %out
+}
+
+define i64 @cntb_all() {
+; CHECK-LABEL: @cntb_all(
+; CHECK-NEXT:[[TMP1:%.*]] = call i64 @llvm.vscale.i64()
+; CHECK-NEXT:[[OUT:%.*]] = shl i64 [[TMP1]], 4
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %out = call i64 @llvm.aarch64.sve.cntb(i32 31)
+  ret i64 %out
+}
+
+;
+; CNTH
+;
+
+define i64 @cnth_vl1() {
+; CHECK-LABEL: @cnth_vl1(
+; CHECK-NEXT:ret i64 1
+;
+  %out = call i64 @llvm.aarch64.sve.cnth(i32 1)
+  ret i64 %out
+}
+
+define i64 @cnth_vl2() {
+; CHECK-LABEL: @cnth_vl2(
+; CHECK-NEXT:ret i64 2
+;
+  %out = call i64 @llvm.aarch64.sve.cnth(i32 2)
+  ret i64 %out
+}
+
+define i64 @cnth_vl4() {
+; CHECK-LABEL: @cnth_vl4(
+; CHECK-NEXT:ret i64 4
+;
+  %out = call i64 @llvm.aarch64.sve.cnth(i32 4)
+  ret i64 %out
+}
+
+define i64 @cnth_mul3() {
+; CHECK-LABEL: @cnth_mul3(
+; CHECK-NEXT:ret i64 24
+;
+  %cnt = call i64 @llvm.aarch64.sve.cnth(i32 8)
+  %out = mul i64 %cnt, 3
+  ret i64 %out
+}
+
+define i64 @cnth_mul4() {
+; CHECK-LABEL: @cnth_mul4(
+; CHECK-NEXT:[[CNT:%.*]] = call i64 @llvm.aarch64.sve.cnth(i32 9)
+; CHECK-NEXT:[[OUT:%.*]] = shl i64 [[CNT]], 2
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %cnt = call i64 @llvm.aarch64.sve.cnth(i32 9)
+  %out = mul i64 %cnt, 4
+  ret i64 %out
+}
+
+define i64 @cnth_all() {
+; CHECK-LABEL: @cnth_all(
+; CHECK-NEXT:[[TMP1:%.*]] = call i64 @llvm.vscale.i64()
+; CHECK-NEXT:[[OUT:%.*]] = shl i64 [[TMP1]], 3
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %out = call i64 @llvm.aarch64.sve.cnth(i32 31)
+  ret i64 %out
+}
+
+;
+; CNTW
+;
+
+define i64 @cntw_vl1() {
+; CHECK-LABEL: @cntw_vl1(
+; CHECK-NEXT:ret i64 1
+;
+  %out = call i64 @llvm.aarch64.sve.cntw(i32 1)
+  ret i64 %out
+}
+
+define i64 @cntw_vl2() {
+; CHECK-LABEL: @cntw_vl2(
+; CHECK-NEXT:ret i64 2
+;
+  %out = call i64 @llvm.aarch64.sve.cntw(i32 2)
+  ret i64 %out
+}
+
+define i64 @cntw_vl4() {
+; CHECK-LABEL: @cntw_vl4(
+; CHECK-NEXT:ret i64 4
+;
+  %out = call i64 @llvm.aarch64.sve.cntw(i32 4)
+  ret i64 %out
+}
+
+define i64 @cntw_mul3() {
+; CHECK-LABEL: @cntw_mul3(
+; CHECK-NEXT:[[CNT:%.*]] = call i64 @llvm.aarch64.sve.cntw(i32 8)
+; CHECK-NEXT:[[OUT:%.*]] = mul i64 [[CNT]], 3
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %cnt = call i64 @llvm.aarch64.sve.cntw(i32 8)
+  %out = mul i64 %cnt, 3
+  ret i64 %out
+}
+
+define i64 @cntw_mul4() {
+; CHECK-LABEL: @cntw_mul4(
+; CHECK-NEXT:[[CNT:%.*]] = call i64 @llvm.aarch64.sve.cntw(i32 9)
+; CHECK-NEXT:[[OUT:%.*]] = shl i64 [[CNT]], 2
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %cnt = call i64 @llvm.aarch64.sve.cntw(i32 9)
+  %out = mul i64 %cnt, 4
+  ret i64 %out
+}
+
+define i64 @cntw_all() {
+; CHECK-LABEL: @cntw_all(
+; CHECK-NEXT:[[TMP1:%.*]] = call i64 @llvm.vscale.i64()
+; CHECK-NEXT:[[OUT:%.*]] = shl i64 [[TMP1]], 2
+; CHECK-NEXT:ret i64 [[OUT]]
+;
+  %out = call i64 @llvm.aarch64.sve.cntw(i32 31)
+  ret i64 %out
+}
+
+
+;
+; CNTD
+;
+
+define i64 @cntd_vl1() {
+; CHECK-LABEL: @cntd_vl1(
+; CHECK-NEXT:

[PATCH] D105097: [clang][AArch64][SVE] Handle PRValue under VLAT <-> VLST cast

2021-06-30 Thread JunMa via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rG3afbf898044a: [clang][AArch64][SVE] Handle PRValue under 
VLAT <-> VLST cast (authored by junparser).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105097/new/

https://reviews.llvm.org/D105097

Files:
  clang/lib/CodeGen/CGExprScalar.cpp
  clang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-call.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
  clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c

Index: clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
===
--- clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
+++ clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
@@ -22,13 +22,13 @@
 // CHECK-128-LABEL: @write_global_i64(
 // CHECK-128-NEXT:  entry:
 // CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = call <2 x i64> @llvm.experimental.vector.extract.v2i64.nxv2i64(<vscale x 2 x i64> [[V:%.*]], i64 0)
-// CHECK-128-NEXT:store <2 x i64> [[CASTFIXEDSVE]], <2 x i64>* @global_i64, align 16, [[TBAA6:!tbaa !.*]]
+// CHECK-128-NEXT:store <2 x i64> [[CASTFIXEDSVE]], <2 x i64>* @global_i64, align 16, !tbaa [[TBAA6:![0-9]+]]
 // CHECK-128-NEXT:ret void
 //
 // CHECK-512-LABEL: @write_global_i64(
 // CHECK-512-NEXT:  entry:
 // CHECK-512-NEXT:[[CASTFIXEDSVE:%.*]] = call <8 x i64> @llvm.experimental.vector.extract.v8i64.nxv2i64(<vscale x 2 x i64> [[V:%.*]], i64 0)
-// CHECK-512-NEXT:store <8 x i64> [[CASTFIXEDSVE]], <8 x i64>* @global_i64, align 16, [[TBAA6:!tbaa !.*]]
+// CHECK-512-NEXT:store <8 x i64> [[CASTFIXEDSVE]], <8 x i64>* @global_i64, align 16, !tbaa [[TBAA6:![0-9]+]]
 // CHECK-512-NEXT:ret void
 //
 void write_global_i64(svint64_t v) { global_i64 = v; }
@@ -36,33 +36,33 @@
 // CHECK-128-LABEL: @write_global_bf16(
 // CHECK-128-NEXT:  entry:
 // CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = call <8 x bfloat> @llvm.experimental.vector.extract.v8bf16.nxv8bf16(<vscale x 8 x bfloat> [[V:%.*]], i64 0)
-// CHECK-128-NEXT:store <8 x bfloat> [[CASTFIXEDSVE]], <8 x bfloat>* @global_bf16, align 16, [[TBAA6]]
+// CHECK-128-NEXT:store <8 x bfloat> [[CASTFIXEDSVE]], <8 x bfloat>* @global_bf16, align 16, !tbaa [[TBAA6]]
 // CHECK-128-NEXT:ret void
 //
 // CHECK-512-LABEL: @write_global_bf16(
 // CHECK-512-NEXT:  entry:
 // CHECK-512-NEXT:[[CASTFIXEDSVE:%.*]] = call <32 x bfloat> @llvm.experimental.vector.extract.v32bf16.nxv8bf16(<vscale x 8 x bfloat> [[V:%.*]], i64 0)
-// CHECK-512-NEXT:store <32 x bfloat> [[CASTFIXEDSVE]], <32 x bfloat>* @global_bf16, align 16, [[TBAA6]]
+// CHECK-512-NEXT:store <32 x bfloat> [[CASTFIXEDSVE]], <32 x bfloat>* @global_bf16, align 16, !tbaa [[TBAA6]]
 // CHECK-512-NEXT:ret void
 //
 void write_global_bf16(svbfloat16_t v) { global_bf16 = v; }
 
 // CHECK-128-LABEL: @write_global_bool(
 // CHECK-128-NEXT:  entry:
-// CHECK-128-NEXT:[[V_ADDR:%.*]] = alloca <vscale x 16 x i1>, align 16
-// CHECK-128-NEXT:store <vscale x 16 x i1> [[V:%.*]], <vscale x 16 x i1>* [[V_ADDR]], align 16, [[TBAA9:!tbaa !.*]]
-// CHECK-128-NEXT:[[TMP0:%.*]] = bitcast <vscale x 16 x i1>* [[V_ADDR]] to <2 x i8>*
-// CHECK-128-NEXT:[[TMP1:%.*]] = load <2 x i8>, <2 x i8>* [[TMP0]], align 16, [[TBAA6]]
-// CHECK-128-NEXT:store <2 x i8> [[TMP1]], <2 x i8>* @global_bool, align 2, [[TBAA6]]
+// CHECK-128-NEXT:[[SAVED_VALUE:%.*]] = alloca <vscale x 16 x i1>, align 16
+// CHECK-128-NEXT:store <vscale x 16 x i1> [[V:%.*]], <vscale x 16 x i1>* [[SAVED_VALUE]], align 16, !tbaa [[TBAA9:![0-9]+]]
+// CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = bitcast <vscale x 16 x i1>* [[SAVED_VALUE]] to <2 x i8>*
+// CHECK-128-NEXT:[[TMP0:%.*]] = load <2 x i8>, <2 x i8>* [[CASTFIXEDSVE]], align 16, !tbaa [[TBAA6]]
+// CHECK-128-NEXT:store <2 x i8> [[TMP0]], <2 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
 // CHECK-128-NEXT:ret void
 //
 // CHECK-512-LABEL: @write_global_bool(
 // CHECK-512-NEXT:  entry:
-// CHECK-512-NEXT:[[V_ADDR:%.*]] = alloca <vscale x 16 x i1>, align 16
-// CHECK-512-NEXT:store <vscale x 16 x i1> [[V:%.*]], <vscale x 16 x i1>* [[V_ADDR]], align 16, [[TBAA9:!tbaa !.*]]
-// CHECK-512-NEXT:[[TMP0:%.*]] = bitcast <vscale x 16 x i1>* [[V_ADDR]] to <8 x i8>*
-// CHECK-512-NEXT:[[TMP1:%.*]] = load <8 x i8>, <8 x i8>* [[TMP0]], align 16, [[TBAA6]]
-// CHECK-512-NEXT:store <8 x i8> [[TMP1]], <8 x i8>* @global_bool, align 2, [[TBAA6]]
+// CHECK-512-NEXT:[[SAVED_VALUE:%.*]] = alloca <vscale x 16 x i1>, align 16
+// CHECK-512-NEXT:store <vscale x 16 x i1> [[V:%.*]], <vscale x 16 x i1>* [[SAVED_VALUE]], align 16, !tbaa [[TBAA9:![0-9]+]]
+// CHECK-512-NEXT:[[CASTFIXEDSVE:%.*]] = bitcast <vscale x 16 x i1>* [[SAVED_VALUE]] to <8 x i8>*
+// CHECK-512-NEXT:[[TMP0:%.*]] = load <8 x i8>, <8 x i8>* [[CASTFIXEDSVE]], align 16, !tbaa [[TBAA6]]
+// CHECK-512-NEXT:store <8 x i8> [[TMP0]], <8 x i8>* @global_bool, align 2, !tbaa [[TBAA6]]
 // CHECK-512-NEXT:ret void
 //
 void write_global_bool(svbool_t v) { global_bool = v; }
@@ -73,13 +73,13 @@
 
 // CHECK-128-LABEL: @read_global_i64(
 // CHECK-128-NEXT:  entry:
-// CHECK-128-NEXT:[[TMP0:%.*]] = load <2 x i64>, <2 x i64>* @global_i64, align 16, [[TBAA6]]
+// CHE

[PATCH] D105097: [clang][AArch64][SVE] Handle PRValue under VLAT <-> VLST cast

2021-06-30 Thread JunMa via Phabricator via cfe-commits

junparser added inline comments.



Comment at: clang/test/CodeGen/attr-arm-sve-vector-bits-globals.c:108
+// CHECK-128-NEXT:[[CASTFIXEDSVE:%.*]] = bitcast <2 x i8>* [[SAVED_VALUE]] to <vscale x 16 x i1>*
+// CHECK-128-NEXT:[[TMP1:%.*]] = load <vscale x 16 x i1>, <vscale x 16 x i1>* [[CASTFIXEDSVE]], align 16, !tbaa [[TBAA6]]
+// CHECK-128-NEXT:ret <vscale x 16 x i1> [[TMP1]]

efriedma wrote:
> Oh, hmm, this is the case where we can't optimize.
> 
> We could probably teach instcombine to convert this pattern into a load 
> directly from the global, if it matters.
Yep, we can also convert this to bitcast + vector.insert when vscale_range is
not specified. I'll check the RVV codegen to see whether it differs.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105097/new/

https://reviews.llvm.org/D105097


[Diffusion] rGc93f93b2e3f2: Revert "Revert "Recommit "Revert "[CVP] processSwitch: Remove default case when…

2021-11-23 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

In rGc93f93b2e3f28997f794265089fb8138dd5b5f13#1040054, @lkail wrote:

> Looks a more general way should be implemented in tailduplicator to avoid 
> adding quadratic edges in CFGs.

Yep, I'll revert this first.


BRANCHES
  EmptyLineAfterFunctionDefinition, fix_asan, main

Users:
  junparser (Author)

https://reviews.llvm.org/rGc93f93b2e3f2


[Diffusion] rGc93f93b2e3f2: Revert "Revert "Recommit "Revert "[CVP] processSwitch: Remove default case when…

2021-11-23 Thread JunMa via Phabricator via cfe-commits

junparser added a reverting change: rG07333810caee: Revert "Revert "Revert 
"Recommit "Revert "[CVP] processSwitch: Remove default….

BRANCHES
  EmptyLineAfterFunctionDefinition, fix_asan, main

Users:
  junparser (Author)

https://reviews.llvm.org/rGc93f93b2e3f2


[PATCH] D70555: [coroutines] Don't build promise init with no args

2020-03-22 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70555/new/

https://reviews.llvm.org/D70555




[PATCH] D76119: [Coroutines] Insert lifetime intrinsics even O0 is used

2020-03-23 Thread JunMa via Phabricator via cfe-commits

This revision was not accepted when it landed; it landed in state "Needs 
Review".
This revision was automatically updated to reflect the committed changes.
Closed by commit rGd0f4af8f3088: [Coroutines] Insert lifetime intrinsics even 
O0 is used (authored by junparser).
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76119/new/

https://reviews.llvm.org/D76119

Files:
  clang/lib/CodeGen/BackendUtil.cpp
  clang/test/CodeGenCoroutines/coro-always-inline.cpp


Index: clang/test/CodeGenCoroutines/coro-always-inline.cpp
===
--- /dev/null
+++ clang/test/CodeGenCoroutines/coro-always-inline.cpp
@@ -0,0 +1,54 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -fcoroutines-ts \
+// RUN:   -fexperimental-new-pass-manager -O0 %s -o - | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -fcoroutines-ts \
+// RUN:   -fexperimental-new-pass-manager -fno-inline -O0 %s -o - | FileCheck %s
+
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -fcoroutines-ts \
+// RUN:   -O0 %s -o - | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -fcoroutines-ts \
+// RUN:   -fno-inline -O0 %s -o - | FileCheck %s
+
+namespace std {
+namespace experimental {
+
+struct handle {};
+
+struct awaitable {
+  bool await_ready() { return true; }
+  // CHECK-NOT: await_suspend
+  inline void __attribute__((__always_inline__)) await_suspend(handle) {}
+  bool await_resume() { return true; }
+};
+
+template 
+struct coroutine_handle {
+  static handle from_address(void *address) { return {}; }
+};
+
+template 
+struct coroutine_traits {
+  struct promise_type {
+awaitable initial_suspend() { return {}; }
+awaitable final_suspend() { return {}; }
+void return_void() {}
+T get_return_object() { return T(); }
+void unhandled_exception() {}
+  };
+};
+} // namespace experimental
+} // namespace std
+
+// CHECK-LABEL: @_Z3foov
+// CHECK-LABEL: entry:
+// CHECK-NEXT: %this.addr.i{{[0-9]*}} = alloca %"struct.std::experimental::awaitable"*, align 8
+// CHECK-NEXT: %this.addr.i{{[0-9]*}} = alloca %"struct.std::experimental::awaitable"*, align 8
+// CHECK: [[CAST0:%[0-9]+]] = bitcast %"struct.std::experimental::awaitable"** %this.addr.i{{[0-9]*}} to i8*
+// CHECK-NEXT: call void @llvm.lifetime.start.p0i8(i64 8, i8* [[CAST0]])
+// CHECK: [[CAST1:%[0-9]+]] = bitcast %"struct.std::experimental::awaitable"** %this.addr.i{{[0-9]*}} to i8*
+// CHECK-NEXT: call void @llvm.lifetime.end.p0i8(i64 8, i8* [[CAST1]])
+
+// CHECK: [[CAST2:%[0-9]+]] = bitcast %"struct.std::experimental::awaitable"** %this.addr.i{{[0-9]*}} to i8*
+// CHECK-NEXT: call void @llvm.lifetime.start.p0i8(i64 8, i8* [[CAST2]])
+// CHECK: [[CAST3:%[0-9]+]] = bitcast %"struct.std::experimental::awaitable"** %this.addr.i{{[0-9]*}} to i8*
+// CHECK-NEXT: call void @llvm.lifetime.end.p0i8(i64 8, i8* [[CAST3]])
+void foo() { co_return; }
Index: clang/lib/CodeGen/BackendUtil.cpp
===
--- clang/lib/CodeGen/BackendUtil.cpp
+++ clang/lib/CodeGen/BackendUtil.cpp
@@ -579,8 +579,9 @@
   // At O0 and O1 we only run the always inliner which is more efficient. At
   // higher optimization levels we run the normal inliner.
   if (CodeGenOpts.OptimizationLevel <= 1) {
-bool InsertLifetimeIntrinsics = (CodeGenOpts.OptimizationLevel != 0 &&
- !CodeGenOpts.DisableLifetimeMarkers);
+bool InsertLifetimeIntrinsics = ((CodeGenOpts.OptimizationLevel != 0 &&
+  !CodeGenOpts.DisableLifetimeMarkers) ||
+ LangOpts.Coroutines);
 PMBuilder.Inliner = createAlwaysInlinerLegacyPass(InsertLifetimeIntrinsics);
   } else {
 // We do not want to inline hot callsites for SamplePGO module-summary build
@@ -1176,7 +1177,10 @@
   // which is just that always inlining occurs. Further, disable generating
   // lifetime intrinsics to avoid enabling further optimizations during
   // code generation.
-  MPM.addPass(AlwaysInlinerPass(/*InsertLifetimeIntrinsics=*/false));
+  // However, we need to insert lifetime intrinsics to avoid invalid access
+  // caused by multithreaded coroutines.
+  MPM.addPass(
+  AlwaysInlinerPass(/*InsertLifetimeIntrinsics=*/LangOpts.Coroutines));
 
   // At -O0, we can still do PGO. Add all the requested passes for
   // instrumentation PGO, if requested.
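
The condition introduced in the first BackendUtil.cpp hunk above can be read as a small truth table. The sketch below restates it in standalone C++ under stated assumptions: `Options` and `insertLifetimeIntrinsics` are hypothetical stand-ins for the `CodeGenOpts` and `LangOpts` fields the patch reads, not Clang types.

```cpp
#include <cassert>

// Hypothetical stand-in for the three option fields the condition reads.
struct Options {
  unsigned optimizationLevel;  // models CodeGenOpts.OptimizationLevel
  bool disableLifetimeMarkers; // models CodeGenOpts.DisableLifetimeMarkers
  bool coroutines;             // models LangOpts.Coroutines
};

// Lifetime intrinsics are inserted by the always-inliner either when
// optimizing with markers enabled, or whenever coroutines are enabled
// (which covers -O0 as well).
constexpr bool insertLifetimeIntrinsics(const Options &o) {
  return (o.optimizationLevel != 0 && !o.disableLifetimeMarkers) ||
         o.coroutines;
}
```

The only behavior change relative to the old condition is in the rows where `coroutines` is true; with coroutines disabled the result is the same as before the patch.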


[PATCH] D70555: [coroutines] Don't build promise init with no args

2020-03-29 Thread JunMa via Phabricator via cfe-commits

junparser accepted this revision.
junparser added a comment.
This revision is now accepted and ready to land.

I'm not sure whether I can approve this.  If not, @modocache  please ignore 
this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70555/new/

https://reviews.llvm.org/D70555




[PATCH] D69022: [coroutines] Remove assert on CoroutineParameterMoves in Sema::buildCoroutineParameterMoves

2019-11-21 Thread JunMa via Phabricator via cfe-commits

junparser updated this revision to Diff 230581.
junparser added a comment.

Updated per @modocache's suggestion.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69022/new/

https://reviews.llvm.org/D69022

Files:
  clang/lib/Sema/SemaCoroutine.cpp
  clang/test/SemaCXX/coroutines.cpp


Index: clang/test/SemaCXX/coroutines.cpp
===
--- clang/test/SemaCXX/coroutines.cpp
+++ clang/test/SemaCXX/coroutines.cpp
@@ -87,6 +87,11 @@
   co_await a;
 }
 
+int no_promise_type_multiple_awaits(int) { // expected-error {{this function cannot be a coroutine: 'std::experimental::coroutine_traits<int, int>' has no member named 'promise_type'}}
+  co_await a;
+  co_await a;
+}
+
 template <>
 struct std::experimental::coroutine_traits<double, double> { typedef int promise_type; };
 double bad_promise_type(double) { // expected-error {{this function cannot be a coroutine: 'experimental::coroutine_traits<double, double>::promise_type' (aka 'int') is not a class}}
Index: clang/lib/Sema/SemaCoroutine.cpp
===
--- clang/lib/Sema/SemaCoroutine.cpp
+++ clang/lib/Sema/SemaCoroutine.cpp
@@ -1527,8 +1527,8 @@
   auto *FD = cast<FunctionDecl>(CurContext);
 
   auto *ScopeInfo = getCurFunction();
-  assert(ScopeInfo->CoroutineParameterMoves.empty() &&
- "Should not build parameter moves twice");
+  if (!ScopeInfo->CoroutineParameterMoves.empty())
+return false;
 
   for (auto *PD : FD->parameters()) {
 if (PD->getType()->isDependentType())

[PATCH] D69022: [coroutines] Remove assert on CoroutineParameterMoves in Sema::buildCoroutineParameterMoves

2019-11-21 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

In D69022#1755636, @modocache wrote:

> Sorry for the slow response here, @junparser!
>
> The test case you came up with here is great! I can see the issue is that 
> `ScopeInfo->CoroutineParameterMoves` are built up when Clang parses the first 
> `co_await a`, but are not cleared when `lookupPromiseType` results in an 
> error. As a result, when Clang hits the second `co_await a`, it's in a state 
> that the current code didn't anticipate. Your test case does a great job 
> demonstrating this. Your fix for the problem also looks good to me! My only 
> suggestion is to make the test case just a little clearer, as I'll explain...
>
> (Also, in the future could you please upload your patches with full context? 
> You can read https://llvm.org/docs/Phabricator.html for more details. I think 
> the section explaining the web interface might be relevant to you, where it 
> suggests `git show HEAD -U99 > mypatch.patch`. The reason I ask is 
> because on Phabricator I can see what lines you're proposing should be added, 
> but not the surrounding source lines, which made this more difficult to 
> review.)


Thanks so much for reviewing the patch and for the helpful advice.
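
The failure mode described in the quoted review can be modeled outside Clang. The sketch below is a toy illustration only: `ScopeInfo`, `buildParameterMoves`, and the string "moves" are hypothetical stand-ins for the Sema state, and the return convention (true when this call built the moves, false when they already existed) is an assumption of the sketch rather than a statement about `Sema::buildCoroutineParameterMoves`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for the per-function scope state kept during parsing.
struct ScopeInfo {
  std::vector<std::string> parameterMoves;
};

// Builds one "move" per parameter the first time it is called for a scope.
// If an earlier co_await already populated the list (for example before
// promise-type lookup failed), return early instead of asserting.
bool buildParameterMoves(ScopeInfo &scope,
                         const std::vector<std::string> &params) {
  if (!scope.parameterMoves.empty())
    return false; // already built; nothing to do for the second co_await

  for (const std::string &p : params)
    scope.parameterMoves.push_back("move(" + p + ")");
  return true;
}
```

Calling the function twice on the same `ScopeInfo`, as the two `co_await a;` statements in the new test do, leaves the list untouched on the second call instead of tripping an assertion.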


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69022/new/

https://reviews.llvm.org/D69022




[PATCH] D69022: [coroutines] Remove assert on CoroutineParameterMoves in Sema::buildCoroutineParameterMoves

2019-11-22 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

In D69022#1756138, @modocache wrote:

> LGTM, thanks! Please let me know if you'd like me to commit this change.


Please commit it, thanks!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69022/new/

https://reviews.llvm.org/D69022




[PATCH] D70579: [coroutines][PR41909] Generalize fix from D62550

2019-11-28 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

FYI, this also fixes the same ICE when building cppcoro with current trunk.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70579/new/

https://reviews.llvm.org/D70579




[PATCH] D71903: [Coroutines][6/6] Clang schedules new passes

2019-12-26 Thread JunMa via Phabricator via cfe-commits

junparser added inline comments.



Comment at: clang/lib/CodeGen/BackendUtil.cpp:1227
+  MPM.addPass(createModuleToPostOrderCGSCCPassAdaptor(CoroSplitPass()));
+  MPM.addPass(createModuleToFunctionPassAdaptor(CoroElidePass()));
+  MPM.addPass(createModuleToPostOrderCGSCCPassAdaptor(CoroSplitPass()));

Since coro elision implicitly depends on other optimization passes (inlining and
so on), how can we adjust the pipeline to account for this?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71903/new/

https://reviews.llvm.org/D71903




[PATCH] D76118: [Coroutines] Do not evaluate InitListExpr of a co_return

2020-03-15 Thread JunMa via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rG53c2e10fb8a6: [Coroutines] Do not evaluate InitListExpr of a 
co_return (authored by junparser).
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76118/new/

https://reviews.llvm.org/D76118

Files:
  clang/lib/CodeGen/CGCoroutine.cpp
  clang/test/CodeGenCoroutines/coro-return-voidtype-initlist.cpp

Index: clang/test/CodeGenCoroutines/coro-return-voidtype-initlist.cpp
===
--- /dev/null
+++ clang/test/CodeGenCoroutines/coro-return-voidtype-initlist.cpp
@@ -0,0 +1,81 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fcoroutines-ts -std=c++1z -emit-llvm %s -o - -disable-llvm-passes | FileCheck %s
+
+namespace std {
+template 
+struct b { b(int, a); };
+template 
+struct c {};
+namespace experimental {
+template 
+struct coroutine_traits : d {};
+template 
+struct coroutine_handle;
+template <>
+struct coroutine_handle<> {};
+template 
+struct coroutine_handle : coroutine_handle<> {
+  static coroutine_handle from_address(void *);
+};
+struct e {
+  int await_ready();
+  void await_suspend(coroutine_handle<>);
+  void await_resume();
+};
+} // namespace experimental
+} // namespace std
+template 
+auto ah(ag) { return ag().ah(0); }
+template 
+struct f;
+struct g {
+  struct h {
+int await_ready();
+template 
+void await_suspend(std::experimental::coroutine_handle);
+void await_resume();
+  };
+  std::experimental::e initial_suspend();
+  h final_suspend();
+  template 
+  auto await_transform(ag) { return ah(ag()); }
+};
+struct j : g {
+  f>> get_return_object();
+  void return_value(std::b>);
+  void unhandled_exception();
+};
+struct k {
+  k(std::experimental::coroutine_handle<>);
+  int await_ready();
+};
+template 
+struct f {
+  using promise_type = j;
+  std::experimental::coroutine_handle<> ar;
+  struct l : k {
+using at = k;
+l(std::experimental::coroutine_handle<> m) : at(m) {}
+void await_suspend(std::experimental::coroutine_handle<>);
+  };
+  struct n : l {
+n(std::experimental::coroutine_handle<> m) : l(m) {}
+am await_resume();
+  };
+  auto ah(int) { return n(ar); }
+};
+template 
+auto ax(std::c, aw) -> f>;
+template 
+struct J { static f>> bo(); };
+// CHECK-LABEL: _ZN1JIiE2boEv(
+template 
+f>> J::bo() {
+  std::c bu;
+  int bw(0);
+  // CHECK: void @_ZN1j12return_valueESt1bISt1cIiiEE(%struct.j* %__promise)
+  co_return{0, co_await ax(bu, bw)};
+}
+void bh() {
+  auto cn = [] { J::bo; };
+  cn();
+}
Index: clang/lib/CodeGen/CGCoroutine.cpp
===
--- clang/lib/CodeGen/CGCoroutine.cpp
+++ clang/lib/CodeGen/CGCoroutine.cpp
@@ -275,9 +275,9 @@
 void CodeGenFunction::EmitCoreturnStmt(CoreturnStmt const &S) {
   ++CurCoro.Data->CoreturnCount;
   const Expr *RV = S.getOperand();
-  if (RV && RV->getType()->isVoidType()) {
-    // Make sure to evaluate the expression of a co_return with a void
-    // expression for side effects.
+  if (RV && RV->getType()->isVoidType() && !isa<InitListExpr>(RV)) {
+    // Make sure to evaluate the non initlist expression of a co_return
+    // with a void expression for side effects.
     RunCleanupsScope cleanupScope(*this);
     EmitIgnoredExpr(RV);
   }

[PATCH] D144169: [WebAssembly] Fix simd bit shift intrinsics codegen

2023-02-16 Thread JunMa via Phabricator via cfe-commits

junparser created this revision.
Herald added subscribers: pmatos, asb, ecnelises, sunfish, jgravelle-google, 
sbc100, dschuff.
Herald added a project: All.
junparser requested review of this revision.
Herald added subscribers: cfe-commits, aheejin.
Herald added a project: clang.

According to github.com/WebAssembly/simd/blob/main/proposals/simd/SIMD.md,
the shift count of bit shift instructions is taken modulo the lane width.
This patch adds that operation.

Fixes PR#60655
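
As a rough illustration of the semantics described above (not the actual
wasm_simd128.h change), the sketch below models a left shift over 16 lanes of
8 bits in plain C++. The helper names mask_shift_count and shl_i8x16_model are
hypothetical and exist only for this example; the mask lane_bits - 1 assumes
the lane width is a power of two.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: reduce a shift count modulo the lane width.
// Assumes lane_bits is a power of two, so a bitmask can stand in for `%`.
inline uint32_t mask_shift_count(uint32_t count, uint32_t lane_bits) {
  assert(lane_bits != 0 && (lane_bits & (lane_bits - 1)) == 0);
  return count & (lane_bits - 1);
}

// Hypothetical scalar model of a 16 x 8-bit lane left shift: every lane is
// shifted by the same count, taken modulo 8.
inline std::array<uint8_t, 16> shl_i8x16_model(std::array<uint8_t, 16> lanes,
                                               uint32_t count) {
  const uint32_t c = mask_shift_count(count, 8);
  for (uint8_t &lane : lanes)
    lane = static_cast<uint8_t>(lane << c);
  return lanes;
}
```

In this model a count of 9 on 8-bit lanes behaves like a count of 1, and a
count of 8 leaves the lanes unchanged.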


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D144169

Files:
  clang/lib/Headers/wasm_simd128.h
  clang/test/Headers/wasm.c


[PATCH] D144169: [WebAssembly] Fix simd bit shift intrinsics codegen

2023-02-16 Thread JunMa via Phabricator via cfe-commits

junparser updated this revision to Diff 497935.
junparser added a comment.

Replace rem with bitmask operation
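
For unsigned counts and power-of-two lane widths, a remainder and a bitmask
give the same result, which is presumably why the two are interchangeable
here. A small sketch that checks this over a range of counts; the helper names
count_by_rem, count_by_mask and rem_and_mask_agree are hypothetical and not
part of the patch.

```cpp
#include <cstdint>

// Hypothetical helper: shift count reduced with an unsigned remainder.
inline uint32_t count_by_rem(uint32_t count, uint32_t lane_bits) {
  return count % lane_bits;
}

// Hypothetical helper: shift count reduced with a bitmask.
inline uint32_t count_by_mask(uint32_t count, uint32_t lane_bits) {
  return count & (lane_bits - 1);
}

// Returns true if both reductions agree for every count in [0, limit).
// lane_bits must be non-zero.
inline bool rem_and_mask_agree(uint32_t lane_bits, uint32_t limit) {
  for (uint32_t count = 0; count < limit; ++count) {
    if (count_by_rem(count, lane_bits) != count_by_mask(count, lane_bits))
      return false;
  }
  return true;
}
```

The check passes for lane widths such as 8, 16, 32 and 64, and fails for a
width that is not a power of two, such as 12.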


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144169/new/

https://reviews.llvm.org/D144169

Files:
  clang/lib/Headers/wasm_simd128.h
  clang/test/Headers/wasm.c

Index: clang/test/Headers/wasm.c
===
--- clang/test/Headers/wasm.c
+++ clang/test/Headers/wasm.c
@@ -1584,11 +1584,12 @@
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <16 x i8>
 // CHECK-NEXT:[[TMP1:%.*]] = trunc i32 [[B:%.*]] to i8
-// CHECK-NEXT:[[TMP2:%.*]] = insertelement <16 x i8> undef, i8 [[TMP1]], i64 0
-// CHECK-NEXT:[[SH_PROM_I:%.*]] = shufflevector <16 x i8> [[TMP2]], <16 x i8> poison, <16 x i32> zeroinitializer
+// CHECK-NEXT:[[TMP2:%.*]] = and i8 [[TMP1]], 7
+// CHECK-NEXT:[[TMP3:%.*]] = insertelement <16 x i8> undef, i8 [[TMP2]], i64 0
+// CHECK-NEXT:[[SH_PROM_I:%.*]] = shufflevector <16 x i8> [[TMP3]], <16 x i8> poison, <16 x i32> zeroinitializer
 // CHECK-NEXT:[[SHL_I:%.*]] = shl <16 x i8> [[TMP0]], [[SH_PROM_I]]
-// CHECK-NEXT:[[TMP3:%.*]] = bitcast <16 x i8> [[SHL_I]] to <4 x i32>
-// CHECK-NEXT:ret <4 x i32> [[TMP3]]
+// CHECK-NEXT:[[TMP4:%.*]] = bitcast <16 x i8> [[SHL_I]] to <4 x i32>
+// CHECK-NEXT:ret <4 x i32> [[TMP4]]
 //
 v128_t test_i8x16_shl(v128_t a, uint32_t b) {
   return wasm_i8x16_shl(a, b);
@@ -1598,11 +1599,12 @@
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <16 x i8>
 // CHECK-NEXT:[[TMP1:%.*]] = trunc i32 [[B:%.*]] to i8
-// CHECK-NEXT:[[TMP2:%.*]] = insertelement <16 x i8> undef, i8 [[TMP1]], i64 0
-// CHECK-NEXT:[[SH_PROM_I:%.*]] = shufflevector <16 x i8> [[TMP2]], <16 x i8> poison, <16 x i32> zeroinitializer
+// CHECK-NEXT:[[TMP2:%.*]] = and i8 [[TMP1]], 7
+// CHECK-NEXT:[[TMP3:%.*]] = insertelement <16 x i8> undef, i8 [[TMP2]], i64 0
+// CHECK-NEXT:[[SH_PROM_I:%.*]] = shufflevector <16 x i8> [[TMP3]], <16 x i8> poison, <16 x i32> zeroinitializer
 // CHECK-NEXT:[[SHR_I:%.*]] = ashr <16 x i8> [[TMP0]], [[SH_PROM_I]]
-// CHECK-NEXT:[[TMP3:%.*]] = bitcast <16 x i8> [[SHR_I]] to <4 x i32>
-// CHECK-NEXT:ret <4 x i32> [[TMP3]]
+// CHECK-NEXT:[[TMP4:%.*]] = bitcast <16 x i8> [[SHR_I]] to <4 x i32>
+// CHECK-NEXT:ret <4 x i32> [[TMP4]]
 //
 v128_t test_i8x16_shr(v128_t a, uint32_t b) {
   return wasm_i8x16_shr(a, b);
@@ -1612,11 +1614,12 @@
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <16 x i8>
 // CHECK-NEXT:[[TMP1:%.*]] = trunc i32 [[B:%.*]] to i8
-// CHECK-NEXT:[[TMP2:%.*]] = insertelement <16 x i8> undef, i8 [[TMP1]], i64 0
-// CHECK-NEXT:[[SH_PROM_I:%.*]] = shufflevector <16 x i8> [[TMP2]], <16 x i8> poison, <16 x i32> zeroinitializer
+// CHECK-NEXT:[[TMP2:%.*]] = and i8 [[TMP1]], 7
+// CHECK-NEXT:[[TMP3:%.*]] = insertelement <16 x i8> undef, i8 [[TMP2]], i64 0
+// CHECK-NEXT:[[SH_PROM_I:%.*]] = shufflevector <16 x i8> [[TMP3]], <16 x i8> poison, <16 x i32> zeroinitializer
 // CHECK-NEXT:[[SHR_I:%.*]] = lshr <16 x i8> [[TMP0]], [[SH_PROM_I]]
-// CHECK-NEXT:[[TMP3:%.*]] = bitcast <16 x i8> [[SHR_I]] to <4 x i32>
-// CHECK-NEXT:ret <4 x i32> [[TMP3]]
+// CHECK-NEXT:[[TMP4:%.*]] = bitcast <16 x i8> [[SHR_I]] to <4 x i32>
+// CHECK-NEXT:ret <4 x i32> [[TMP4]]
 //
 v128_t test_u8x16_shr(v128_t a, uint32_t b) {
   return wasm_u8x16_shr(a, b);
@@ -1801,11 +1804,12 @@
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <8 x i16>
 // CHECK-NEXT:[[TMP1:%.*]] = trunc i32 [[B:%.*]] to i16
-// CHECK-NEXT:[[TMP2:%.*]] = insertelement <8 x i16> undef, i16 [[TMP1]], i64 0
-// CHECK-NEXT:[[SH_PROM_I:%.*]] = shufflevector <8 x i16> [[TMP2]], <8 x i16> poison, <8 x i32> zeroinitializer
+// CHECK-NEXT:[[TMP2:%.*]] = and i16 [[TMP1]], 15
+// CHECK-NEXT:[[TMP3:%.*]] = insertelement <8 x i16> undef, i16 [[TMP2]], i64 0
+// CHECK-NEXT:[[SH_PROM_I:%.*]] = shufflevector <8 x i16> [[TMP3]], <8 x i16> poison, <8 x i32> zeroinitializer
 // CHECK-NEXT:[[SHL_I:%.*]] = shl <8 x i16> [[TMP0]], [[SH_PROM_I]]
-// CHECK-NEXT:[[TMP3:%.*]] = bitcast <8 x i16> [[SHL_I]] to <4 x i32>
-// CHECK-NEXT:ret <4 x i32> [[TMP3]]
+// CHECK-NEXT:[[TMP4:%.*]] = bitcast <8 x i16> [[SHL_I]] to <4 x i32>
+// CHECK-NEXT:ret <4 x i32> [[TMP4]]
 //
 v128_t test_i16x8_shl(v128_t a, uint32_t b) {
   return wasm_i16x8_shl(a, b);
@@ -1815,11 +1819,12 @@
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:[[TMP0:%.*]] = bitcast <4 x i32> [[A:%.*]] to <8 x i16>
 // CHECK-NEXT:[[TMP1:%.*]] = trunc i32 [[B:%.*]] to i16
-// CHECK-NEXT:[[TMP2:%.*]] = insertelement <8 x i16> undef, i16 [[TMP1]], i64 0
-// CHECK-NEXT:[[SH_PROM_I:%.*]] = shufflevector <8 x i16> [[TMP2]], <8 x i16> poison, <8 x i32> zeroinitializer
+//

[PATCH] D144169: [WebAssembly] Fix simd bit shift intrinsics codegen

2023-02-16 Thread JunMa via Phabricator via cfe-commits

This revision was automatically updated to reflect the committed changes.
Closed by commit rGf253bb640d97: [WebAssembly] Fix simd bit shift intrinsics 
codegen (authored by junparser).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D144169/new/

https://reviews.llvm.org/D144169

Files:
  clang/lib/Headers/wasm_simd128.h
  clang/test/Headers/wasm.c


[PATCH] D98638: [RFC][Coroutine] Force stack allocation after await_suspend() call

2021-03-17 Thread JunMa via Phabricator via cfe-commits

junparser added a comment.

In D98638#2630786, @lxfind wrote:

> Well, I guess another potential solution is to force emitting lifetime 
> intrinsics for this part of coroutine in the front-end.
> Like this:
>
>   diff --git a/clang/lib/CodeGen/CGDecl.cpp b/clang/lib/CodeGen/CGDecl.cpp
>   index 243d93a8c165..ef76e8dcb7c9 100644
>   --- a/clang/lib/CodeGen/CGDecl.cpp
>   +++ b/clang/lib/CodeGen/CGDecl.cpp
>   @@ -1317,7 +1317,7 @@ void CodeGenFunction::EmitAutoVarDecl(const VarDecl &D) {
>/// otherwise
>llvm::Value *CodeGenFunction::EmitLifetimeStart(uint64_t Size,
>llvm::Value *Addr) {
>   -  if (!ShouldEmitLifetimeMarkers)
>   +  if (!ShouldEmitLifetimeMarkers && !isCoroutine())
>return nullptr;
>
>  assert(Addr->getType()->getPointerAddressSpace() ==
>   diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
>   index 18f1468dcb86..2e6e6808db7f 100644
>   --- a/clang/lib/CodeGen/CGExpr.cpp
>   +++ b/clang/lib/CodeGen/CGExpr.cpp
>   @@ -535,7 +535,7 @@ EmitMaterializeTemporaryExpr(const MaterializeTemporaryExpr *M) {
>  break;
>
>case SD_FullExpression: {
>   -  if (!ShouldEmitLifetimeMarkers)
>   +  if (!ShouldEmitLifetimeMarkers && !isCoroutine())
>break;
>
>  // Avoid creating a conditional cleanup just to hold an llvm.lifetime.end

We already allow emitting lifetime intrinsics for always-inline functions
under O2, so IMO emitting lifetime intrinsics for coroutine functions is OK,
since stack coloring has less effect on coroutine functions.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98638/new/

https://reviews.llvm.org/D98638

