[Lldb-commits] [clang-tools-extra] [flang] [compiler-rt] [lldb] [libc] [clang] [libcxx] [libunwind] [llvm] [lld] [RISCV][MC] Add experimental support of Zaamo and Zalrsc (PR #78970)

2024-01-25 Thread Wang Pengcheng via lldb-commits

https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/78970

>From 8cc71cb7ddb2e6691d31138ae2ef683a0690e171 Mon Sep 17 00:00:00 2001
From: wangpc 
Date: Mon, 22 Jan 2024 21:11:42 +0800
Subject: [PATCH 1/7] [𝘀𝗽𝗿] initial version
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.4
---
 clang/lib/Basic/Targets/RISCV.cpp |   2 +-
 .../test/Preprocessor/riscv-target-features.c |  18 ++
 llvm/docs/RISCVUsage.rst  |   2 +
 llvm/lib/Support/RISCVISAInfo.cpp |   2 +
 llvm/lib/Target/RISCV/RISCVFeatures.td|  26 ++-
 llvm/test/CodeGen/RISCV/attributes.ll |   8 +
 llvm/test/MC/RISCV/rv32i-invalid.s|   2 +-
 llvm/test/MC/RISCV/rv32zaamo-invalid.s|  11 ++
 llvm/test/MC/RISCV/rv32zaamo-valid.s  | 122 ++
 llvm/test/MC/RISCV/rv32zalrsc-invalid.s   |   7 +
 llvm/test/MC/RISCV/rv32zalrsc-valid.s |  36 
 llvm/test/MC/RISCV/rv64zaamo-invalid.s|  11 ++
 llvm/test/MC/RISCV/rv64zaamo-valid.s  | 157 ++
 llvm/test/MC/RISCV/rv64zalrsc-invalid.s   |   7 +
 llvm/test/MC/RISCV/rv64zalrsc-valid.s |  42 +
 llvm/unittests/Support/RISCVISAInfoTest.cpp   |   2 +
 16 files changed, 452 insertions(+), 3 deletions(-)
 create mode 100644 llvm/test/MC/RISCV/rv32zaamo-invalid.s
 create mode 100644 llvm/test/MC/RISCV/rv32zaamo-valid.s
 create mode 100644 llvm/test/MC/RISCV/rv32zalrsc-invalid.s
 create mode 100644 llvm/test/MC/RISCV/rv32zalrsc-valid.s
 create mode 100644 llvm/test/MC/RISCV/rv64zaamo-invalid.s
 create mode 100644 llvm/test/MC/RISCV/rv64zaamo-valid.s
 create mode 100644 llvm/test/MC/RISCV/rv64zalrsc-invalid.s
 create mode 100644 llvm/test/MC/RISCV/rv64zalrsc-valid.s

diff --git a/clang/lib/Basic/Targets/RISCV.cpp 
b/clang/lib/Basic/Targets/RISCV.cpp
index c71b2e9eeb6c172..9af9bdd1d74e9dd 100644
--- a/clang/lib/Basic/Targets/RISCV.cpp
+++ b/clang/lib/Basic/Targets/RISCV.cpp
@@ -175,7 +175,7 @@ void RISCVTargetInfo::getTargetDefines(const LangOptions 
&Opts,
 Builder.defineMacro("__riscv_muldiv");
   }
 
-  if (ISAInfo->hasExtension("a")) {
+  if (ISAInfo->hasExtension("a") || ISAInfo->hasExtension("zaamo")) {
 Builder.defineMacro("__riscv_atomic");
 Builder.defineMacro("__GCC_HAVE_SYNC_COMPARE_AND_SWAP_1");
 Builder.defineMacro("__GCC_HAVE_SYNC_COMPARE_AND_SWAP_2");
diff --git a/clang/test/Preprocessor/riscv-target-features.c 
b/clang/test/Preprocessor/riscv-target-features.c
index 5fde5ccdbeacfb0..38473d07004a574 100644
--- a/clang/test/Preprocessor/riscv-target-features.c
+++ b/clang/test/Preprocessor/riscv-target-features.c
@@ -141,7 +141,9 @@
 
 // Experimental extensions
 
+// CHECK-NOT: __riscv_zaamo {{.*$}}
 // CHECK-NOT: __riscv_zacas {{.*$}}
+// CHECK-NOT: __riscv_zalrsc {{.*$}}
 // CHECK-NOT: __riscv_zcmop {{.*$}}
 // CHECK-NOT: __riscv_zfbfmin {{.*$}}
 // CHECK-NOT: __riscv_zicfilp {{.*$}}
@@ -1307,6 +1309,14 @@
 // CHECK-ZVKT-EXT: __riscv_zvkt 100{{$}}
 
 // Experimental extensions
+// RUN: %clang --target=riscv32 -menable-experimental-extensions \
+// RUN: -march=rv32i_zaamo0p1 -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-ZAAMO-EXT %s
+// RUN: %clang --target=riscv64 -menable-experimental-extensions \
+// RUN: -march=rv64i_zaamo0p1 -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-ZAAMO-EXT %s
+// CHECK-ZAAMO-EXT: __riscv_atomic 1
+// CHECK-ZAAMO-EXT: __riscv_zaamo 1000{{$}}
 
 // RUN: %clang --target=riscv32 -menable-experimental-extensions \
 // RUN: -march=rv32i_zacas1p0 -x c -E -dM %s \
@@ -1316,6 +1326,14 @@
 // RUN: -o - | FileCheck --check-prefix=CHECK-ZACAS-EXT %s
 // CHECK-ZACAS-EXT: __riscv_zacas 100{{$}}
 
+// RUN: %clang --target=riscv32 -menable-experimental-extensions \
+// RUN: -march=rv32i_zalrsc0p1 -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-ZALRSC-EXT %s
+// RUN: %clang --target=riscv64 -menable-experimental-extensions \
+// RUN: -march=rv64i_zalrsc0p1 -x c -E -dM %s \
+// RUN: -o - | FileCheck --check-prefix=CHECK-ZALRSC-EXT %s
+// CHECK-ZALRSC-EXT: __riscv_zalrsc 1000{{$}}
+
 // RUN: %clang --target=riscv32 -menable-experimental-extensions \
 // RUN: -march=rv32izfbfmin1p0 -x c -E -dM %s \
 // RUN: -o - | FileCheck --check-prefix=CHECK-ZFBFMIN-EXT %s
diff --git a/llvm/docs/RISCVUsage.rst b/llvm/docs/RISCVUsage.rst
index 6fdc945ad27078e..005e9f1d7324445 100644
--- a/llvm/docs/RISCVUsage.rst
+++ b/llvm/docs/RISCVUsage.rst
@@ -100,6 +100,8 @@ on support follow.
  ``V``Supported
  ``Za128rs``  Supported (`See note 
<#riscv-profiles-extensions-note>`__)
  ``Za64rs``   Supported (`See note 
<#riscv-profiles-extensions-note>`__)
+ ``Zaamo``Supported
+ ``Zalrsc``   Supported
  ``Zawrs``Assembly Support
  ``Zba``  Supported
  ``Zbb``  Sup

[Lldb-commits] [flang] [lld] [libc] [clang-tools-extra] [clang] [libcxx] [libunwind] [llvm] [lldb] [compiler-rt] [RISCV][MC] Add experimental support of Zaamo and Zalrsc (PR #78970)

2024-01-25 Thread Wang Pengcheng via lldb-commits

https://github.com/wangpc-pp closed 
https://github.com/llvm/llvm-project/pull/78970
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [libcxx] [lldb] [clang-tools-extra] [libc] [compiler-rt] [llvm] [clang] [flang] [lld] [Clang][C++23] Implement P2448R2: Relaxing some constexpr restrictions (PR #77753)

2024-01-25 Thread Mariya Podchishchaeva via lldb-commits

Fznamznon wrote:

> So I guess we should set the DefaultedDestructorIsConstexpr to false and only 
> use it for warning?

I'm not sure. Switching all constexpr-related errors to warnings doesn't seem 
right: even though almost all functions can now be marked constexpr, they still 
can't be called from a constexpr context. In addition, if we always keep 
DefaultedDestructorIsConstexpr `false`, we will still produce _some_ 
diagnostic for the code in 
https://github.com/llvm/llvm-project/pull/78195#issuecomment-1895950521 and we 
should not, I suppose?
Sorry if I'm misunderstanding and being slow.

https://github.com/llvm/llvm-project/pull/77753


[Lldb-commits] [flang] [lld] [libc] [clang-tools-extra] [libclc] [clang] [libcxx] [libcxxabi] [llvm] [lldb] [compiler-rt] [VPlan] Add new VPUniformPerUFRecipe, use for step truncation. (PR #78113)

2024-01-25 Thread Florian Hahn via lldb-commits

https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/78113

>From 36b085f21b76d7bf7c9965a86a09d1cef4fe9329 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Sun, 14 Jan 2024 14:13:08 +
Subject: [PATCH 1/5] [VPlan] Add new VPUniformPerUFRecipe, use for step
 truncation.

Add a new recipe to model uniform-per-UF instructions, without relying
on an underlying instruction. Initially, it supports uniform cast-ops
and is therefore storing the result type.

Unlike the current VPReplicateRecipe, the new recipe does not rely on an
underlying instruction, which allows creating instances without a
corresponding instruction.

In the future, the plan is to extend this recipe to handle all opcodes
needed to replace the uniform part of VPReplicateRecipe.
---
 llvm/lib/Transforms/Vectorize/VPlan.h | 30 
 .../Transforms/Vectorize/VPlanAnalysis.cpp|  6 ++-
 .../lib/Transforms/Vectorize/VPlanRecipes.cpp | 49 ---
 .../Transforms/Vectorize/VPlanTransforms.cpp  |  9 
 llvm/lib/Transforms/Vectorize/VPlanValue.h|  1 +
 .../LoopVectorize/cast-induction.ll   |  4 +-
 .../interleave-and-scalarize-only.ll  |  3 +-
 .../pr46525-expander-insertpoint.ll   |  2 +-
 8 files changed, 93 insertions(+), 11 deletions(-)

diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h 
b/llvm/lib/Transforms/Vectorize/VPlan.h
index 4b4f4911eb6415..d598522448 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -1945,6 +1945,36 @@ class VPReplicateRecipe : public VPRecipeWithIRFlags, 
public VPValue {
   }
 };
 
+/// VPUniformPerUFRecipe represents an instruction with Opcode that is uniform
+/// per UF, i.e. it generates a single scalar instance per UF.
+/// TODO: at the moment, only Cast opcodes are supported, extend to support
+///   missing opcodes to replace uniform part of VPReplicateRecipe.
+class VPUniformPerUFRecipe : public VPRecipeBase, public VPValue {
+  unsigned Opcode;
+
+  /// Result type for the cast.
+  Type *ResultTy;
+
+  Value *generate(VPTransformState &State, unsigned Part);
+
+public:
+  VPUniformPerUFRecipe(Instruction::CastOps Opcode, VPValue *Op, Type 
*ResultTy)
+  : VPRecipeBase(VPDef::VPUniformPerUFSC, {Op}), VPValue(this),
+Opcode(Opcode), ResultTy(ResultTy) {}
+
+  ~VPUniformPerUFRecipe() override = default;
+
+  VP_CLASSOF_IMPL(VPDef::VPWidenIntOrFpInductionSC)
+
+  void execute(VPTransformState &State) override;
+
+#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
+  /// Print the recipe.
+  void print(raw_ostream &O, const Twine &Indent,
+ VPSlotTracker &SlotTracker) const override;
+#endif
+};
+
 /// A recipe for generating conditional branches on the bits of a mask.
 class VPBranchOnMaskRecipe : public VPRecipeBase {
 public:
diff --git a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp 
b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
index 97a8a1803bbf5a..d71b0703994450 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
@@ -230,7 +230,11 @@ Type *VPTypeAnalysis::inferScalarType(const VPValue *V) {
 return V->getUnderlyingValue()->getType();
   })
   .Case(
-  [](const VPWidenCastRecipe *R) { return R->getResultType(); });
+  [](const VPWidenCastRecipe *R) { return R->getResultType(); })
+  .Case([](const VPExpandSCEVRecipe *R) {
+return R->getSCEV()->getType();
+  });
+
   assert(ResultTy && "could not infer type for the given VPValue");
   CachedTypes[V] = ResultTy;
   return ResultTy;
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp 
b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 1f844bce23102e..423504e8f7e05e 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -164,6 +164,8 @@ bool VPRecipeBase::mayHaveSideEffects() const {
 auto *R = cast(this);
 return R->getUnderlyingInstr()->mayHaveSideEffects();
   }
+  case VPUniformPerUFSC:
+return false;
   default:
 return true;
   }
@@ -1117,13 +1119,7 @@ void VPScalarIVStepsRecipe::execute(VPTransformState 
&State) {
 
   // Ensure step has the same type as that of scalar IV.
   Type *BaseIVTy = BaseIV->getType()->getScalarType();
-  if (BaseIVTy != Step->getType()) {
-// TODO: Also use VPDerivedIVRecipe when only the step needs truncating, to
-// avoid separate truncate here.
-assert(Step->getType()->isIntegerTy() &&
-   "Truncation requires an integer step");
-Step = State.Builder.CreateTrunc(Step, BaseIVTy);
-  }
+  assert(BaseIVTy == Step->getType());
 
   // We build scalar steps for both integer and floating-point induction
   // variables. Here, we determine the kind of arithmetic we will perform.
@@ -1469,6 +1465,45 @@ void VPReplicateRecipe::print(raw_ostream &O, const 
Twine &Indent,
 }
 #endif
 
+Value *VPUniformPerUFRecipe ::generate(VPTransformState 

[Lldb-commits] [lldb] [clang-tools-extra] [compiler-rt] [libcxx] [libc] [lld] [clang] [llvm] [flang] [Clang][C++23] Implement P2448R2: Relaxing some constexpr restrictions (PR #77753)

2024-01-25 Thread via lldb-commits

cor3ntin wrote:

> So I guess we should set the DefaultedDestructorIsConstexpr to false and only 
> use it for warning?

Oh gosh, I'm an idiot, I meant **`true`**


https://github.com/llvm/llvm-project/pull/77753


[Lldb-commits] [libc] [compiler-rt] [libclc] [lldb] [clang-tools-extra] [flang] [libcxx] [libcxxabi] [llvm] [lld] [clang] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread Florian Hahn via lldb-commits

https://github.com/fhahn edited https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [libcxx] [libcxxabi] [clang] [flang] [lldb] [llvm] [libclc] [clang-tools-extra] [compiler-rt] [libc] [lld] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread Florian Hahn via lldb-commits

https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/78113

>From 36b085f21b76d7bf7c9965a86a09d1cef4fe9329 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Sun, 14 Jan 2024 14:13:08 +
Subject: [PATCH 1/6] [VPlan] Add new VPUniformPerUFRecipe, use for step
 truncation.

Add a new recipe to model uniform-per-UF instructions, without relying
on an underlying instruction. Initially, it supports uniform cast-ops
and is therefore storing the result type.

Unlike the current VPReplicateRecipe, the new recipe does not rely on an
underlying instruction, which allows creating instances without a
corresponding instruction.

In the future, the plan is to extend this recipe to handle all opcodes
needed to replace the uniform part of VPReplicateRecipe.
---
 llvm/lib/Transforms/Vectorize/VPlan.h | 30 
 .../Transforms/Vectorize/VPlanAnalysis.cpp|  6 ++-
 .../lib/Transforms/Vectorize/VPlanRecipes.cpp | 49 ---
 .../Transforms/Vectorize/VPlanTransforms.cpp  |  9 
 llvm/lib/Transforms/Vectorize/VPlanValue.h|  1 +
 .../LoopVectorize/cast-induction.ll   |  4 +-
 .../interleave-and-scalarize-only.ll  |  3 +-
 .../pr46525-expander-insertpoint.ll   |  2 +-
 8 files changed, 93 insertions(+), 11 deletions(-)

diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h 
b/llvm/lib/Transforms/Vectorize/VPlan.h
index 4b4f4911eb6415e..d5985224488 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -1945,6 +1945,36 @@ class VPReplicateRecipe : public VPRecipeWithIRFlags, 
public VPValue {
   }
 };
 
+/// VPUniformPerUFRecipe represents an instruction with Opcode that is uniform
+/// per UF, i.e. it generates a single scalar instance per UF.
+/// TODO: at the moment, only Cast opcodes are supported, extend to support
+///   missing opcodes to replace uniform part of VPReplicateRecipe.
+class VPUniformPerUFRecipe : public VPRecipeBase, public VPValue {
+  unsigned Opcode;
+
+  /// Result type for the cast.
+  Type *ResultTy;
+
+  Value *generate(VPTransformState &State, unsigned Part);
+
+public:
+  VPUniformPerUFRecipe(Instruction::CastOps Opcode, VPValue *Op, Type 
*ResultTy)
+  : VPRecipeBase(VPDef::VPUniformPerUFSC, {Op}), VPValue(this),
+Opcode(Opcode), ResultTy(ResultTy) {}
+
+  ~VPUniformPerUFRecipe() override = default;
+
+  VP_CLASSOF_IMPL(VPDef::VPWidenIntOrFpInductionSC)
+
+  void execute(VPTransformState &State) override;
+
+#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
+  /// Print the recipe.
+  void print(raw_ostream &O, const Twine &Indent,
+ VPSlotTracker &SlotTracker) const override;
+#endif
+};
+
 /// A recipe for generating conditional branches on the bits of a mask.
 class VPBranchOnMaskRecipe : public VPRecipeBase {
 public:
diff --git a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp 
b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
index 97a8a1803bbf5a5..d71b07039944500 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
@@ -230,7 +230,11 @@ Type *VPTypeAnalysis::inferScalarType(const VPValue *V) {
 return V->getUnderlyingValue()->getType();
   })
   .Case(
-  [](const VPWidenCastRecipe *R) { return R->getResultType(); });
+  [](const VPWidenCastRecipe *R) { return R->getResultType(); })
+  .Case([](const VPExpandSCEVRecipe *R) {
+return R->getSCEV()->getType();
+  });
+
   assert(ResultTy && "could not infer type for the given VPValue");
   CachedTypes[V] = ResultTy;
   return ResultTy;
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp 
b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 1f844bce23102e2..423504e8f7e05e7 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -164,6 +164,8 @@ bool VPRecipeBase::mayHaveSideEffects() const {
 auto *R = cast(this);
 return R->getUnderlyingInstr()->mayHaveSideEffects();
   }
+  case VPUniformPerUFSC:
+return false;
   default:
 return true;
   }
@@ -1117,13 +1119,7 @@ void VPScalarIVStepsRecipe::execute(VPTransformState 
&State) {
 
   // Ensure step has the same type as that of scalar IV.
   Type *BaseIVTy = BaseIV->getType()->getScalarType();
-  if (BaseIVTy != Step->getType()) {
-// TODO: Also use VPDerivedIVRecipe when only the step needs truncating, to
-// avoid separate truncate here.
-assert(Step->getType()->isIntegerTy() &&
-   "Truncation requires an integer step");
-Step = State.Builder.CreateTrunc(Step, BaseIVTy);
-  }
+  assert(BaseIVTy == Step->getType());
 
   // We build scalar steps for both integer and floating-point induction
   // variables. Here, we determine the kind of arithmetic we will perform.
@@ -1469,6 +1465,45 @@ void VPReplicateRecipe::print(raw_ostream &O, const 
Twine &Indent,
 }
 #endif
 
+Value *VPUniformPerUFRecipe ::generate(VPTransform

[Lldb-commits] [libc] [compiler-rt] [libclc] [lldb] [clang-tools-extra] [flang] [libcxx] [libcxxabi] [llvm] [lld] [clang] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread Florian Hahn via lldb-commits


@@ -1469,6 +1461,52 @@ void VPReplicateRecipe::print(raw_ostream &O, const 
Twine &Indent,
 }
 #endif
 
+static bool isUniformAcrossVFsAndUFs(VPScalarCastRecipe *C) {

fhahn wrote:

Added comment + TODO, thanks!

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [compiler-rt] [lldb] [flang] [libc] [libcxxabi] [llvm] [lld] [libcxx] [clang] [libclc] [clang-tools-extra] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread Florian Hahn via lldb-commits


@@ -230,7 +230,11 @@ Type *VPTypeAnalysis::inferScalarType(const VPValue *V) {
 return V->getUnderlyingValue()->getType();
   })
   .Case(
-  [](const VPWidenCastRecipe *R) { return R->getResultType(); });
+  [](const VPWidenCastRecipe *R) { return R->getResultType(); })
+  .Case([](const VPExpandSCEVRecipe *R) {

fhahn wrote:

`VPDerivedIVRecipe`'s `getScalarType` would be good to remove; I'll do that as 
a follow-up.

`VPWidenIntOrFpInductionRecipe`'s trunc drives truncating the start value and 
step, effectively creating a narrow phi. I'll check to see if this can also be 
modeled with the new recipe.
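
To illustrate why truncating the start value and the step gives the same values as truncating the wide induction afterwards, here is a minimal sketch in plain C++ (not LLVM code; the names `wideThenTrunc` and `narrowIV` are made up for this example). It assumes an integer induction with wrap-around arithmetic, where truncation commutes with addition and multiplication:

```cpp
#include <cassert>
#include <cstdint>

// Value of a 64-bit induction at iteration I, truncated to 32 bits afterwards.
constexpr uint32_t wideThenTrunc(uint64_t Start, uint64_t Step, uint32_t I) {
  return static_cast<uint32_t>(Start + static_cast<uint64_t>(I) * Step);
}

// Value of a 32-bit ("narrow") induction built from the truncated start and
// the truncated step. Unsigned arithmetic wraps modulo 2^32.
constexpr uint32_t narrowIV(uint64_t Start, uint64_t Step, uint32_t I) {
  return static_cast<uint32_t>(Start) + I * static_cast<uint32_t>(Step);
}

// Both start and step have bits set above bit 31; the low 32 bits still agree.
static_assert(wideThenTrunc(0x1FFFFFFFFULL, 0x100000003ULL, 5) ==
                  narrowIV(0x1FFFFFFFFULL, 0x100000003ULL, 5),
              "truncation commutes with the induction arithmetic");
```

This only covers the integer case; it says nothing about how the recipe itself is implemented.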

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [llvm] [flang] [libc] [lld] [libclc] [libcxx] [lldb] [clang] [clang-tools-extra] [libcxxabi] [compiler-rt] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread Florian Hahn via lldb-commits

https://github.com/fhahn commented:

Comments should be addressed and title & description updated, thanks!

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [libclc] [libcxx] [libcxxabi] [llvm] [clang-tools-extra] [flang] [libc] [compiler-rt] [lldb] [lld] [clang] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread Florian Hahn via lldb-commits


@@ -1469,6 +1461,52 @@ void VPReplicateRecipe::print(raw_ostream &O, const 
Twine &Indent,
 }
 #endif
 
+static bool isUniformAcrossVFsAndUFs(VPScalarCastRecipe *C) {
+  return C->isDefinedOutsideVectorRegions() ||
+ isa(C->getOperand(0)) ||
+ isa(C->getOperand(0));
+}
+
+Value *VPScalarCastRecipe ::generate(VPTransformState &State, unsigned Part) {
+  assert(vputils::onlyFirstLaneUsed(this) &&
+ "Codegen only implemented for first lane.");
+  switch (Opcode) {
+  case Instruction::SExt:
+  case Instruction::ZExt:

fhahn wrote:

Yep, could remove if preferred, but `State.Builder.CreateCast` generically 
supports any cast
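
As a rough illustration of what "generically supports any cast" means, the sketch below models a single entry point that takes the cast opcode as a parameter, the way `IRBuilder`'s `CreateCast` takes an `Instruction::CastOps` value. It is plain C++ on 64-bit integers, not LLVM code; `CastOp` and `applyCast` are hypothetical names used only here:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the integer cast opcodes mentioned in the hunk.
enum class CastOp { Trunc, ZExt, SExt };

// Apply an integer cast to the low FromBits bits of V and return the result
// in the low ToBits bits. FromBits and ToBits must both be in [1, 64].
constexpr uint64_t applyCast(CastOp Op, uint64_t V, unsigned FromBits,
                             unsigned ToBits) {
  const uint64_t FromMask = FromBits == 64 ? ~0ULL : (1ULL << FromBits) - 1;
  const uint64_t ToMask = ToBits == 64 ? ~0ULL : (1ULL << ToBits) - 1;
  V &= FromMask;
  if (Op == CastOp::SExt) {
    // Replicate the source sign bit into all higher bits.
    const uint64_t SignBit = 1ULL << (FromBits - 1);
    V = (V ^ SignBit) - SignBit;
  }
  // Trunc and ZExt only differ in which of the two widths is larger.
  return V & ToMask;
}

static_assert(applyCast(CastOp::SExt, 0x80, 8, 32) == 0xFFFFFF80ULL,
              "sign extension fills the upper bits");
static_assert(applyCast(CastOp::ZExt, 0x80, 8, 32) == 0x80ULL,
              "zero extension leaves the upper bits clear");
```

With one opcode-parameterized entry point, a caller does not need a separate case per opcode unless it wants to restrict which casts are accepted.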

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [libcxx] [lld] [lldb] [flang] [compiler-rt] [llvm] [libcxxabi] [libclc] [clang] [libc] [clang-tools-extra] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread Florian Hahn via lldb-commits


@@ -498,10 +498,34 @@ static VPValue *createScalarIVSteps(VPlan &Plan, const 
InductionDescriptor &ID,
   VPCanonicalIVPHIRecipe *CanonicalIV = Plan.getCanonicalIV();
   Type *TruncTy = TruncI ? TruncI->getType() : IVTy;
   VPValue *BaseIV = CanonicalIV;

fhahn wrote:

Updated, thanks! This works well now with the refactoring in this patch. IVTy 
removed.

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [libc] [flang] [libunwind] [libcxx] [clang-tools-extra] [compiler-rt] [lldb] [clang] [llvm] [mlir] [lld] Reland: [libc++][format] P2637R3: Member visit (std::basic_format_arg) #76449 (P

2024-01-25 Thread Hristo Hristov via lldb-commits

https://github.com/H-G-Hristov updated 
https://github.com/llvm/llvm-project/pull/79032

>From e03452fda84a5284420bba1913299b68caabb6cd Mon Sep 17 00:00:00 2001
From: Zingam 
Date: Mon, 22 Jan 2024 20:35:00 +0200
Subject: [PATCH 1/2] Revert "Revert "[libc++][format] P2637R3: Member `visit`
 (`std::basic_format_arg`) (#76449)""

This reverts commit 02f95b77515fe18ed1076b94cbb850ea0cf3c77e.
---
 libcxx/docs/ReleaseNotes/18.rst   |   1 +
 libcxx/docs/Status/Cxx2cPapers.csv|   2 +-
 libcxx/docs/Status/FormatIssues.csv   |   2 +-
 libcxx/include/__config   |   6 +
 libcxx/include/__format/format_arg.h  | 109 +-
 libcxx/include/__format/format_context.h  |  33 +-
 libcxx/include/format |   2 +-
 .../format.arg/visit.pass.cpp | 333 
 .../format.arg/visit.return_type.pass.cpp | 369 ++
 .../visit_format_arg.deprecated.verify.cpp|  38 ++
 .../format.arg/visit_format_arg.pass.cpp  |   6 +-
 .../format.arguments/format.args/get.pass.cpp |  48 ++-
 libcxx/test/support/test_basic_format_arg.h   |  20 +-
 libcxx/test/support/test_macros.h |   5 +
 .../generate_feature_test_macro_components.py |   1 +
 15 files changed, 927 insertions(+), 48 deletions(-)
 create mode 100644 
libcxx/test/std/utilities/format/format.arguments/format.arg/visit.pass.cpp
 create mode 100644 
libcxx/test/std/utilities/format/format.arguments/format.arg/visit.return_type.pass.cpp
 create mode 100644 
libcxx/test/std/utilities/format/format.arguments/format.arg/visit_format_arg.deprecated.verify.cpp

diff --git a/libcxx/docs/ReleaseNotes/18.rst b/libcxx/docs/ReleaseNotes/18.rst
index fd882bafe19a51..237a63022d55ff 100644
--- a/libcxx/docs/ReleaseNotes/18.rst
+++ b/libcxx/docs/ReleaseNotes/18.rst
@@ -79,6 +79,7 @@ Implemented Papers
 - P1759R6 - Native handles and file streams
 - P2868R3 - Remove Deprecated ``std::allocator`` Typedef From C++26
 - P2517R1 - Add a conditional ``noexcept`` specification to ``std::apply``
+- P2637R3 - Member ``visit``
 - P2447R6 - ``span`` over initializer list
 
 
diff --git a/libcxx/docs/Status/Cxx2cPapers.csv 
b/libcxx/docs/Status/Cxx2cPapers.csv
index f80b1f6b663f04..c45aa3c510072e 100644
--- a/libcxx/docs/Status/Cxx2cPapers.csv
+++ b/libcxx/docs/Status/Cxx2cPapers.csv
@@ -17,7 +17,7 @@
 "`P0792R14 `__","LWG","``function_ref``: a 
type-erased callable reference","Varna June 2023","","",""
 "`P2874R2 `__","LWG","Mandating Annex D Require No 
More","Varna June 2023","","",""
 "`P2757R3 `__","LWG","Type-checking format 
args","Varna June 2023","","","|format|"
-"`P2637R3 `__","LWG","Member ``visit``","Varna June 
2023","|Partial|","18.0",""
+"`P2637R3 `__","LWG","Member ``visit``","Varna June 
2023","|Complete|","18.0",""
 "`P2641R4 `__","CWG, LWG","Checking if a ``union`` 
alternative is active","Varna June 2023","","",""
 "`P1759R6 `__","LWG","Native handles and file 
streams","Varna June 2023","|Complete|","18.0",""
 "`P2697R1 `__","LWG","Interfacing ``bitset`` with 
``string_view``","Varna June 2023","|Complete|","18.0",""
diff --git a/libcxx/docs/Status/FormatIssues.csv 
b/libcxx/docs/Status/FormatIssues.csv
index 513988d08036ca..6e58e752191ea5 100644
--- a/libcxx/docs/Status/FormatIssues.csv
+++ b/libcxx/docs/Status/FormatIssues.csv
@@ -16,7 +16,7 @@ Number,Name,Standard,Assignee,Status,First released version
 "`P2693R1 `__","Formatting ``thread::id`` and 
``stacktrace``","C++23","Mark de Wever","|In Progress|"
 "`P2510R3 `__","Formatting pointers","C++26","Mark 
de Wever","|Complete|",17.0
 "`P2757R3 `__","Type-checking format 
args","C++26","","",
-"`P2637R3 `__","Member ``visit``","C++26","","",
+"`P2637R3 `__","Member ``visit``","C++26","Hristo 
Hristov","|Complete|",18.0
 "`P2905R2 `__","Runtime format strings","C++26 
DR","Mark de Wever","|Complete|",18.0
 "`P2918R2 `__","Runtime format strings 
II","C++26","Mark de Wever","|Complete|",18.0
 "`P2909R4 `__","Fix formatting of code units as 
integers (Dude, where’s my ``char``?)","C++26 DR","Mark de 
Wever","|Complete|",18.0
diff --git a/libcxx/include/__config b/libcxx/include/__config
index 9a64cdb489119d..00489d971c296c 100644
--- a/libcxx/include/__config
+++ b/libcxx/include/__config
@@ -995,6 +995,12 @@ typedef __char32_t char32_t;
 #define _LIBCPP_DEPRECATED_IN_CXX23
 #  endif
 
+#  if _LIBCPP_STD_VER >= 26
+#define _LIBCPP_DEPRECATED_IN_CXX26 _LIBCPP_DEPRECATED
+#  else
+#define _LIBCPP_DEPRECATED_IN_CXX26
+#  endif
+
 #  if !defined(_LIBCPP_HAS_NO_C

[Lldb-commits] [lldb] [lldb] Implement WebAssembly debugging (PR #77949)

2024-01-25 Thread Quentin Michaud via lldb-commits

mh4ck-Thales wrote:

I already tried to use `register read` to access Wasm variables, without 
success. But that was with the patch available 
[here](https://github.com/bytecodealliance/wasm-micro-runtime/blob/main/build-scripts/lldb_wasm.patch), 
which is part of WAMR; maybe this patch is different and will make it work. I 
tried using this patch without success.

It seems that compiling lldb with this patch and using it to debug Wasm with 
the latest version of WAMR / iwasm does not work correctly (at least for me). 
lldb can connect to the server embedded in iwasm but doesn't seem to be able to 
disassemble Wasm bytecode or set breakpoints. I'm just giving the info here 
because I'm not sure this patch is supposed to be compatible with WAMR straight 
away.

https://github.com/llvm/llvm-project/pull/77949


[Lldb-commits] [lldb] [lldb] Implement WebAssembly debugging (PR #77949)

2024-01-25 Thread Xu Jun via lldb-commits

xujuntwt95329 wrote:

> I already tried to use `register read` to access Wasm variables, without 
> success. But that was with the patch available 
> [here](https://github.com/bytecodealliance/wasm-micro-runtime/blob/main/build-scripts/lldb_wasm.patch), 
> which is part of WAMR; maybe this patch is different and will make it work. I 
> tried using this patch without success.
> 
> It seems that compiling lldb with this patch and using it to debug Wasm with 
> the latest version of WAMR / iwasm does not work correctly (at least for me). 
> lldb can connect to the server embedded in iwasm but doesn't seem to be able 
> to disassemble Wasm bytecode or set breakpoints. I'm just giving the info 
> here because I'm not sure this patch is supposed to be compatible with WAMR 
> straight away.

Hi @mh4ck-Thales, this is caused by 
https://github.com/llvm/llvm-project/pull/77949#discussion_r1463458728; 
currently we need to modify it manually. 

Since this PR is not merged yet, you need to use the `wasm_debug_2024` branch 
in my WAMR fork for testing: 
https://github.com/xujuntwt95329/wasm-micro-runtime/tree/wasm_debug_2024

https://github.com/llvm/llvm-project/pull/77949


[Lldb-commits] [clang] [libunwind] [libcxx] [flang] [compiler-rt] [clang-tools-extra] [libc] [lldb] [llvm] [lld] [X86] Use RORX over SHR imm (PR #77964)

2024-01-25 Thread Simon Pilgrim via lldb-commits


@@ -4216,6 +4217,97 @@ MachineSDNode *X86DAGToDAGISel::emitPCMPESTR(unsigned 
ROpc, unsigned MOpc,
   return CNode;
 }
 
+// When the consumer of a right shift (arithmetic or logical) wouldn't notice
+// the difference if the instruction was a rotate right instead (because the
+// bits shifted in are truncated away), the shift can be replaced by the RORX
+// instruction from BMI2. This doesn't set flags and can output to a different
+// register. However, this increases code size in most cases, and doesn't leave
+// the high bits in a useful state. There may be other situations where this
+// transformation is profitable given those conditions, but currently the
+// transformation is only made when it likely avoids spilling flags.
+bool X86DAGToDAGISel::rightShiftUncloberFlags(SDNode *N) {
+  EVT VT = N->getValueType(0);
+
+  // Target has to have BMI2 for RORX
+  if (!Subtarget->hasBMI2())
+return false;
+
+  // Only handle scalar shifts.
+  if (VT.isVector())
+return false;
+
+  unsigned OpSize;
+  if (VT == MVT::i64)
+OpSize = 64;
+  else if (VT == MVT::i32)
+OpSize = 32;
+  else if (VT == MVT::i16)
+OpSize = 16;
+  else if (VT == MVT::i8)
+return false; // i8 shift can't be truncated.
+  else
+llvm_unreachable("Unexpected shift size");
+
+  unsigned TruncateSize = 0;
+  // This only works when the result is truncated.
+  for (const SDNode *User : N->uses()) {
+auto name = User->getOperationName(CurDAG);
+if (!User->isMachineOpcode() ||
+User->getMachineOpcode() != TargetOpcode::EXTRACT_SUBREG)
+  return false;
+EVT TuncateType = User->getValueType(0);
+if (TuncateType == MVT::i32)
+  TruncateSize = std::max(TruncateSize, 32U);
+else if (TuncateType == MVT::i16)
+  TruncateSize = std::max(TruncateSize, 16U);
+else if (TuncateType == MVT::i8)
+  TruncateSize = std::max(TruncateSize, 8U);
+else
+  return false;
+  }
+  if (TruncateSize >= OpSize)
+return false;
+
+  // The shift must be by an immediate that wouldn't expose the zero or sign
+  // extended result.
+  auto *ShiftAmount = dyn_cast(N->getOperand(1));
+  if (!ShiftAmount || ShiftAmount->getZExtValue() > OpSize - TruncateSize)
+return false;
+
+  // Only make the replacement when it avoids clobbering used flags. This is a
+  // similar heuristic as used in the conversion to LEA, namely looking at the
+  // operand for an instruction that creates flags where those flags are used.
+  // This will have both false positives and false negatives. Ideally, both of
+  // these happen later on. Perhaps in copy to flags lowering or in register
+  // allocation.
+  bool MightClobberFlags = false;
+  SDNode *Input = N->getOperand(0).getNode();
+  for (auto Use : Input->uses()) {
+if (Use->getOpcode() == ISD::CopyToReg) {
+  auto *RegisterNode =
+  dyn_cast(Use->getOperand(1).getNode());
+  if (RegisterNode && RegisterNode->getReg() == X86::EFLAGS) {
+MightClobberFlags = true;
+break;
+  }
+}
+  }
+  if (!MightClobberFlags)
+return false;

RKSimon wrote:

Is this correct? The logic appears to be flipped.

https://github.com/llvm/llvm-project/pull/77964
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [libc] [clang-tools-extra] [llvm] [compiler-rt] [clang] [lldb] [lld] [flang] [libcxx] [libunwind] [X86] Use RORX over SHR imm (PR #77964)

2024-01-25 Thread Simon Pilgrim via lldb-commits

https://github.com/RKSimon edited 
https://github.com/llvm/llvm-project/pull/77964


[Lldb-commits] [libunwind] [clang-tools-extra] [libc] [flang] [lldb] [lld] [compiler-rt] [libcxx] [llvm] [clang] [X86] Use RORX over SHR imm (PR #77964)

2024-01-25 Thread Simon Pilgrim via lldb-commits


@@ -4216,6 +4217,97 @@ MachineSDNode *X86DAGToDAGISel::emitPCMPESTR(unsigned 
ROpc, unsigned MOpc,
   return CNode;
 }
 
+// When the consumer of a right shift (arithmetic or logical) wouldn't notice
+// the difference if the instruction was a rotate right instead (because the
+// bits shifted in are truncated away), the shift can be replaced by the RORX
+// instruction from BMI2. This doesn't set flags and can output to a different
+// register. However, this increases code size in most cases, and doesn't leave
+// the high bits in a useful state. There may be other situations where this
+// transformation is profitable given those conditions, but currently the
+// transformation is only made when it likely avoids spilling flags.
+bool X86DAGToDAGISel::rightShiftUncloberFlags(SDNode *N) {
+  EVT VT = N->getValueType(0);
+
+  // Target has to have BMI2 for RORX
+  if (!Subtarget->hasBMI2())
+return false;
+
+  // Only handle scalar shifts.
+  if (VT.isVector())
+return false;
+
+  unsigned OpSize;
+  if (VT == MVT::i64)
+OpSize = 64;
+  else if (VT == MVT::i32)
+OpSize = 32;
+  else if (VT == MVT::i16)
+OpSize = 16;
+  else if (VT == MVT::i8)
+return false; // i8 shift can't be truncated.
+  else
+llvm_unreachable("Unexpected shift size");
+
+  unsigned TruncateSize = 0;
+  // This only works when the result is truncated.
+  for (const SDNode *User : N->uses()) {
+auto name = User->getOperationName(CurDAG);

RKSimon wrote:

unused variable

https://github.com/llvm/llvm-project/pull/77964


[Lldb-commits] [libc] [lld] [clang-tools-extra] [libcxx] [libunwind] [compiler-rt] [lldb] [flang] [llvm] [clang] [X86] Use RORX over SHR imm (PR #77964)

2024-01-25 Thread Simon Pilgrim via lldb-commits

https://github.com/RKSimon requested changes to this pull request.


https://github.com/llvm/llvm-project/pull/77964


[Lldb-commits] [clang-tools-extra] [libcxx] [flang] [lldb] [llvm] [libunwind] [compiler-rt] [lld] [libc] [clang] [X86] Use RORX over SHR imm (PR #77964)

2024-01-25 Thread Simon Pilgrim via lldb-commits


@@ -4216,6 +4217,97 @@ MachineSDNode *X86DAGToDAGISel::emitPCMPESTR(unsigned 
ROpc, unsigned MOpc,
   return CNode;
 }
 
+// When the consumer of a right shift (arithmetic or logical) wouldn't notice
+// the difference if the instruction was a rotate right instead (because the
+// bits shifted in are truncated away), the shift can be replaced by the RORX
+// instruction from BMI2. This doesn't set flags and can output to a different
+// register. However, this increases code size in most cases, and doesn't leave
+// the high bits in a useful state. There may be other situations where this
+// transformation is profitable given those conditions, but currently the
+// transformation is only made when it likely avoids spilling flags.
+bool X86DAGToDAGISel::rightShiftUncloberFlags(SDNode *N) {

RKSimon wrote:

typo: rightShiftUncloberFlags -> rightShiftUnclobberFlags

https://github.com/llvm/llvm-project/pull/77964
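
The property described in the quoted comment (a right shift and a rotate right are indistinguishable once the shifted-in bits are truncated away) can be checked with a small standalone sketch. This is not code from the PR: the helper names are hypothetical, it only models a 32-bit logical shift truncated to 8 bits, and it assumes the same bound the patch uses, namely a shift amount of at most OpSize - TruncateSize (here 32 - 8 = 24). The same argument should carry over to arithmetic shifts, which the sketch does not cover.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 32-bit value right by K bits (0 <= K < 32). This models the
// result of a 32-bit rotate right; it is a plain C++ stand-in, not RORX itself.
inline uint32_t rotateRight32(uint32_t X, unsigned K) {
  assert(K < 32 && "rotate amount out of range");
  return K == 0 ? X : (X >> K) | (X << (32 - K));
}

// Logical right shift followed by truncation to the low 8 bits.
inline uint8_t shiftThenTrunc8(uint32_t X, unsigned K) {
  assert(K < 32 && "shift amount out of range");
  return static_cast<uint8_t>(X >> K);
}

// Rotate right followed by truncation to the low 8 bits.
inline uint8_t rotateThenTrunc8(uint32_t X, unsigned K) {
  return static_cast<uint8_t>(rotateRight32(X, K));
}

// True when both forms produce the same truncated result for this input.
inline bool agree(uint32_t X, unsigned K) {
  return shiftThenTrunc8(X, K) == rotateThenTrunc8(X, K);
}
```

With K at or below 24 the bits that the rotate wraps around land at bit position 8 or higher, so the truncation discards them. With K = 25 the lowest input bit wraps into bit 7 of the result, which is why the shift amount has to be bounded.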


[Lldb-commits] [lldb] 03e4070 - [lldb] Silence warning when building with Clang ToT

2024-01-25 Thread Alexandre Ganea via lldb-commits

Author: Alexandre Ganea
Date: 2024-01-25T09:34:18-05:00
New Revision: 03e4070ce1f834eb426aa8f8622838c40ff5c710

URL: 
https://github.com/llvm/llvm-project/commit/03e4070ce1f834eb426aa8f8622838c40ff5c710
DIFF: 
https://github.com/llvm/llvm-project/commit/03e4070ce1f834eb426aa8f8622838c40ff5c710.diff

LOG: [lldb] Silence warning when building with Clang ToT

This fixes:
```
[6331/7452] Building CXX object 
tools\lldb\source\Plugins\Language\CPlusPlus\CMakeFiles\lldbPluginCPlusPlusLanguage.dir\LibCxx.cpp.obj
C:\git\llvm-project\lldb\source\Plugins\Language\CPlusPlus\LibCxx.cpp(1108,38): 
warning: format specifies type 'long' but the argument has type 'std::time_t' 
(aka 'long long') [-Wformat]
 1108 | stream.Printf("timestamp=%ld s", seconds);
  |  ~~~ ^~~
  |  %lld
C:\git\llvm-project\lldb\source\Plugins\Language\CPlusPlus\LibCxx.cpp(1116,63): 
warning: format specifies type 'long' but the argument has type 'std::time_t' 
(aka 'long long') [-Wformat]
 1116 | stream.Printf("date/time=%s timestamp=%ld s", str.data(), seconds);
  |   ~~~ ^~~
  |   %lld
2 warnings generated.
```
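
A note on portability, offered as an assumption rather than as what this commit does: on LP64 targets `std::time_t` is commonly `long`, so a bare `%lld` may produce the mirror-image `-Wformat` warning there. One way to sidestep the mismatch on both kinds of target is to widen the value explicitly so that it always matches the conversion. The helper name `formatTimestamp` below is hypothetical and uses `std::snprintf` instead of lldb's `Stream::Printf`.

```cpp
#include <cassert>
#include <cstdio>
#include <ctime>
#include <string>

// Format a time_t identically on every platform by widening it to
// long long, which always matches the %lld conversion.
inline std::string formatTimestamp(std::time_t Seconds) {
  char Buffer[64];
  int Written = std::snprintf(Buffer, sizeof(Buffer), "timestamp=%lld s",
                              static_cast<long long>(Seconds));
  // The buffer is large enough for any 64-bit value plus the fixed text.
  assert(Written > 0 && static_cast<std::size_t>(Written) < sizeof(Buffer));
  (void)Written; // Avoid an unused-variable warning when asserts are off.
  return std::string(Buffer);
}
```

The cast keeps the format string and the argument type in agreement regardless of how the platform defines `std::time_t`.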

Added: 


Modified: 
lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp

Removed: 




diff  --git a/lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp 
b/lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp
index 060324e2fcfe26..c5bed2cee81507 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp
@@ -1105,7 +1105,7 @@ bool 
lldb_private::formatters::LibcxxChronoSysSecondsSummaryProvider(
 
   const std::time_t seconds = ptr_sp->GetValueAsSigned(0);
   if (seconds < chrono_timestamp_min || seconds > chrono_timestamp_max)
-stream.Printf("timestamp=%ld s", seconds);
+stream.Printf("timestamp=%lld s", seconds);
   else {
 std::array str;
 std::size_t size =
@@ -1113,7 +1113,7 @@ bool 
lldb_private::formatters::LibcxxChronoSysSecondsSummaryProvider(
 if (size == 0)
   return false;
 
-stream.Printf("date/time=%s timestamp=%ld s", str.data(), seconds);
+stream.Printf("date/time=%s timestamp=%lld s", str.data(), seconds);
   }
 
   return true;





[Lldb-commits] [libcxx] [clang] [lldb] [lld] [libc] [clang-tools-extra] [flang] [llvm] [compiler-rt] [VPlan] Implement cloning of VPlans. (PR #73158)

2024-01-25 Thread via lldb-commits


@@ -614,6 +614,61 @@ void VPBasicBlock::print(raw_ostream &O, const Twine 
&Indent,
   printSuccessors(O, Indent);
 }
 #endif
+static void cloneCFG(VPBlockBase *Entry,
+ DenseMap &Old2NewVPBlocks);
+
+static VPBlockBase *cloneVPB(VPBlockBase *BB) {

ayalz wrote:

`cloneVPB()` seems to better fit as a virtual method `VPBlockBase::clone()`?

https://github.com/llvm/llvm-project/pull/73158


[Lldb-commits] [clang-tools-extra] [clang] [llvm] [libc] [flang] [lld] [lldb] [libcxx] [compiler-rt] [VPlan] Implement cloning of VPlans. (PR #73158)

2024-01-25 Thread via lldb-commits


@@ -614,6 +614,61 @@ void VPBasicBlock::print(raw_ostream &O, const Twine 
&Indent,
   printSuccessors(O, Indent);
 }
 #endif
+static void cloneCFG(VPBlockBase *Entry,
+ DenseMap &Old2NewVPBlocks);

ayalz wrote:

This cloning is recursive, so perhaps more accurately called `cloneHCFG()`.
It also clones a (maximal(*)) SESE region, so could be called `cloneSESE()`, 
and return a pair of VPB's - the new entry and new exit? Could 
`Old2NewVPBlocks` map be internal to the function, rather than an in/out 
parameter?

(*) There are currently two call sites: one for the top level of the VPlan,
consisting of the vector-loop-preheader - loop-region - middle-block, and one
for the internal CFG of a region from its entry/header to its exit(ing)/latch.
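
The suggested shape (keep the old-to-new map internal and return the new entry and exit as a pair) could look roughly like the sketch below. It uses a toy `Block` type and a hypothetical `Arena` owner instead of the VPlan classes, assumes the region has exactly one block without successors, and clones in two passes so that edge order is preserved.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Toy stand-in for a CFG block; not the VPlan classes.
struct Block {
  std::string Name;
  std::vector<Block *> Preds;
  std::vector<Block *> Succs;
};

// Owns every block so the raw pointers above stay valid.
struct Arena {
  std::vector<std::unique_ptr<Block>> Storage;
  Block *create(const std::string &Name) {
    Storage.push_back(std::make_unique<Block>());
    Storage.back()->Name = Name;
    return Storage.back().get();
  }
};

inline void connect(Block *From, Block *To) {
  From->Succs.push_back(To);
  To->Preds.push_back(From);
}

// Clone everything reachable from Entry and return {new entry, new exit}.
// The exit is taken to be the unique reachable block without successors.
inline std::pair<Block *, Block *> cloneSESE(Block *Entry, Arena &A) {
  assert(Entry && "entry must be non-null");
  std::unordered_map<Block *, Block *> Old2New; // Internal, never exposed.
  std::vector<Block *> Order;                   // Blocks in visit order.
  std::vector<Block *> Worklist{Entry};
  Old2New[Entry] = A.create(Entry->Name);
  // First pass: create one clone per reachable block.
  while (!Worklist.empty()) {
    Block *Old = Worklist.back();
    Worklist.pop_back();
    Order.push_back(Old);
    for (Block *Succ : Old->Succs)
      if (!Old2New.count(Succ)) {
        Old2New[Succ] = A.create(Succ->Name);
        Worklist.push_back(Succ);
      }
  }
  // Second pass: recreate edges in their original order. Predecessors that
  // lie outside the cloned region are dropped.
  Block *NewExit = nullptr;
  for (Block *Old : Order) {
    Block *New = Old2New[Old];
    for (Block *Succ : Old->Succs)
      New->Succs.push_back(Old2New[Succ]);
    for (Block *Pred : Old->Preds) {
      auto It = Old2New.find(Pred);
      if (It != Old2New.end())
        New->Preds.push_back(It->second);
    }
    if (Old->Succs.empty()) {
      assert(!NewExit && "region must have a single exit");
      NewExit = New;
    }
  }
  assert(NewExit && "region must have an exit");
  return {Old2New[Entry], NewExit};
}
```

A caller that only needs the cloned entry and exit, as in the region case, never sees the map, which is the point of the suggestion.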

https://github.com/llvm/llvm-project/pull/73158


[Lldb-commits] [libc] [clang-tools-extra] [flang] [llvm] [lldb] [clang] [libcxx] [compiler-rt] [lld] [VPlan] Implement cloning of VPlans. (PR #73158)

2024-01-25 Thread via lldb-commits


@@ -1594,6 +1657,13 @@ class VPWidenPHIRecipe : public VPHeaderPHIRecipe {
   addOperand(Start);
   }
 
+  VPRecipeBase *clone() override {
+auto *Res = new VPWidenPHIRecipe(cast(getUnderlyingInstr()),

ayalz wrote:

Better to mark it unreachable than to have untested dead code that could be 
mistaken for live code?

https://github.com/llvm/llvm-project/pull/73158


[Lldb-commits] [lldb] [lld] [compiler-rt] [libc] [libcxx] [flang] [llvm] [clang] [clang-tools-extra] [VPlan] Implement cloning of VPlans. (PR #73158)

2024-01-25 Thread via lldb-commits


@@ -982,6 +1037,94 @@ void VPlan::updateDominatorTree(DominatorTree *DT, 
BasicBlock *LoopHeaderBB,
   assert(DT->verify(DominatorTree::VerificationLevel::Fast));
 }
 
+static void remapOperands(VPBlockBase *Entry, VPBlockBase *NewEntry,
+  DenseMap &Old2NewVPValues) {
+  // Update the operands of all cloned recipes starting at NewEntry. This
+  // traverses all reachable blocks. This is done in two steps, to handle 
cycles
+  // in PHI recipes.
+  ReversePostOrderTraversal>
+  OldDeepRPOT(Entry);
+  ReversePostOrderTraversal>
+  NewDeepRPOT(NewEntry);
+  // First, collect all mappings from old to new VPValues defined by cloned
+  // recipes.
+  for (const auto &[OldBB, NewBB] :
+   zip(VPBlockUtils::blocksOnly(OldDeepRPOT),
+   VPBlockUtils::blocksOnly(NewDeepRPOT))) {
+assert(OldBB->getRecipeList().size() == NewBB->getRecipeList().size() &&
+   "blocks must have the same number of recipes");
+
+for (const auto &[OldR, NewR] : zip(*OldBB, *NewBB)) {
+  assert(OldR.getNumOperands() == NewR.getNumOperands() &&
+ "recipes must have the same number of operands");
+  assert(OldR.getNumDefinedValues() == NewR.getNumDefinedValues() &&
+ "recipes must define the same number of operands");
+  for (const auto &[OldV, NewV] :
+   zip(OldR.definedValues(), NewR.definedValues()))
+Old2NewVPValues[OldV] = NewV;
+}
+  }
+
+  // Update all operands to use cloned VPValues.
+  for (VPBasicBlock *NewBB :
+   VPBlockUtils::blocksOnly(NewDeepRPOT)) {
+for (VPRecipeBase &NewR : *NewBB)
+  for (unsigned I = 0, E = NewR.getNumOperands(); I != E; ++I) {
+VPValue *NewOp = Old2NewVPValues.lookup(NewR.getOperand(I));
+NewR.setOperand(I, NewOp);
+  }
+  }
+}
+
+VPlan *VPlan::clone() {
+  DenseMap Old2NewVPBlocks;
+  DenseMap Old2NewVPValues;
+
+  auto *NewPlan = new VPlan();
+
+  // Clone live-ins.
+  SmallVector NewLiveIns;
+  for (VPValue *OldLiveIn : VPLiveInsToFree) {
+VPValue *NewLiveIn = new VPValue(OldLiveIn->getLiveInIRValue());
+NewPlan->VPLiveInsToFree.push_back(NewLiveIn);
+Old2NewVPValues[OldLiveIn] = NewLiveIn;
+  }
+  Old2NewVPValues[&VectorTripCount] = &NewPlan->VectorTripCount;
+  Old2NewVPValues[&VFxUF] = &NewPlan->VFxUF;
+  if (BackedgeTakenCount) {
+NewPlan->BackedgeTakenCount = new VPValue();
+Old2NewVPValues[BackedgeTakenCount] = NewPlan->BackedgeTakenCount;
+  }
+  assert(TripCount && "trip count must be set");
+  if (TripCount->isLiveIn())
+Old2NewVPValues[TripCount] = new VPValue(TripCount->getLiveInIRValue());
+
+  // Clone blocks.
+  cloneCFG(Preheader, Old2NewVPBlocks);
+  cloneCFG(getEntry(), Old2NewVPBlocks);
+
+  auto *NewPreheader = cast(Old2NewVPBlocks[Preheader]);
+  remapOperands(Preheader, NewPreheader, Old2NewVPValues);
+  auto *NewEntry = cast(Old2NewVPBlocks[Entry]);
+  remapOperands(Entry, NewEntry, Old2NewVPValues);
+
+  // Clone live-outs.
+  for (const auto &[_, LO] : LiveOuts)
+NewPlan->addLiveOut(LO->getPhi(), Old2NewVPValues[LO->getOperand(0)]);
+
+  // Initialize fields of cloned VPlan.
+  NewPlan->Entry = NewEntry;
+  NewPlan->Preheader = NewPreheader;
+  NewEntry->setPlan(NewPlan);
+  NewPreheader->setPlan(NewPlan);
+  NewPlan->VFs = VFs;
+  NewPlan->UFs = UFs;
+  // TODO: Adjust names.
+  NewPlan->Name = Name;
+  NewPlan->TripCount = Old2NewVPValues[TripCount];

ayalz wrote:

nit: can assert that Old2NewVPValues contains TripCount.
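
The reason such an assert helps: `operator[]` on a map (on `DenseMap` as well as on the standard containers) default-constructs a value for a missing key, so a missing mapping silently becomes a null pointer. A minimal sketch of a checked lookup follows, using `std::unordered_map` as a stand-in for `DenseMap`; the helper name `lookupMapped` is hypothetical.

```cpp
#include <cassert>
#include <unordered_map>

// Look up Key in an old-to-new map, asserting that a mapping exists instead
// of silently default-inserting a null pointer the way operator[] would.
template <typename T>
T *lookupMapped(const std::unordered_map<T *, T *> &Old2New, T *Key) {
  auto It = Old2New.find(Key);
  assert(It != Old2New.end() && "missing old-to-new mapping");
  // In builds without asserts, fall back to nullptr rather than
  // dereferencing the end iterator.
  return It == Old2New.end() ? nullptr : It->second;
}
```

Because the helper takes the map by const reference, it cannot insert a placeholder entry by accident.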

https://github.com/llvm/llvm-project/pull/73158


[Lldb-commits] [clang-tools-extra] [clang] [llvm] [compiler-rt] [libcxx] [flang] [lld] [lldb] [libc] [VPlan] Implement cloning of VPlans. (PR #73158)

2024-01-25 Thread via lldb-commits


@@ -982,6 +1037,94 @@ void VPlan::updateDominatorTree(DominatorTree *DT, 
BasicBlock *LoopHeaderBB,
   assert(DT->verify(DominatorTree::VerificationLevel::Fast));
 }
 
+static void remapOperands(VPBlockBase *Entry, VPBlockBase *NewEntry,
+  DenseMap &Old2NewVPValues) {
+  // Update the operands of all cloned recipes starting at NewEntry. This
+  // traverses all reachable blocks. This is done in two steps, to handle 
cycles
+  // in PHI recipes.
+  ReversePostOrderTraversal>
+  OldDeepRPOT(Entry);
+  ReversePostOrderTraversal>
+  NewDeepRPOT(NewEntry);
+  // First, collect all mappings from old to new VPValues defined by cloned
+  // recipes.
+  for (const auto &[OldBB, NewBB] :
+   zip(VPBlockUtils::blocksOnly(OldDeepRPOT),
+   VPBlockUtils::blocksOnly(NewDeepRPOT))) {
+assert(OldBB->getRecipeList().size() == NewBB->getRecipeList().size() &&
+   "blocks must have the same number of recipes");
+
+for (const auto &[OldR, NewR] : zip(*OldBB, *NewBB)) {
+  assert(OldR.getNumOperands() == NewR.getNumOperands() &&
+ "recipes must have the same number of operands");
+  assert(OldR.getNumDefinedValues() == NewR.getNumDefinedValues() &&
+ "recipes must define the same number of operands");
+  for (const auto &[OldV, NewV] :
+   zip(OldR.definedValues(), NewR.definedValues()))
+Old2NewVPValues[OldV] = NewV;
+}
+  }
+
+  // Update all operands to use cloned VPValues.
+  for (VPBasicBlock *NewBB :
+   VPBlockUtils::blocksOnly(NewDeepRPOT)) {
+for (VPRecipeBase &NewR : *NewBB)
+  for (unsigned I = 0, E = NewR.getNumOperands(); I != E; ++I) {
+VPValue *NewOp = Old2NewVPValues.lookup(NewR.getOperand(I));
+NewR.setOperand(I, NewOp);
+  }
+  }
+}
+
+VPlan *VPlan::clone() {
+  DenseMap Old2NewVPBlocks;
+  DenseMap Old2NewVPValues;
+
+  auto *NewPlan = new VPlan();

ayalz wrote:

Perhaps blocks should be cloned first, then values cloned and remapped. I.e.,
```
  auto *NewPreheader = getPreheader().clone(); // Currently a disconnected VPBB.
  auto [NewEntry, _ ] = cloneSESE(getEntry());
  auto *NewPlan = new VPlan(NewPreheader, NewEntry);
```

https://github.com/llvm/llvm-project/pull/73158


[Lldb-commits] [lld] [llvm] [clang] [libc] [compiler-rt] [libcxx] [flang] [lldb] [clang-tools-extra] [VPlan] Implement cloning of VPlans. (PR #73158)

2024-01-25 Thread via lldb-commits


@@ -614,6 +614,61 @@ void VPBasicBlock::print(raw_ostream &O, const Twine 
&Indent,
   printSuccessors(O, Indent);
 }
 #endif
+static void cloneCFG(VPBlockBase *Entry,
+ DenseMap &Old2NewVPBlocks);
+
+static VPBlockBase *cloneVPB(VPBlockBase *BB) {
+  if (auto *VPBB = dyn_cast(BB)) {
+auto *NewBlock = new VPBasicBlock(VPBB->getName());
+for (VPRecipeBase &R : *VPBB)
+  NewBlock->appendRecipe(R.clone());
+return NewBlock;
+  }
+
+  auto *VPR = cast(BB);
+  DenseMap Old2NewVPBlocks;
+  DenseMap Old2NewVPValues;
+  cloneCFG(VPR->getEntry(), Old2NewVPBlocks);
+  VPBlockBase *NewEntry = Old2NewVPBlocks[VPR->getEntry()];
+  auto *NewRegion =
+  new VPRegionBlock(NewEntry, Old2NewVPBlocks[VPR->getExiting()],
+VPR->getName(), VPR->isReplicator());
+  for (VPBlockBase *Block : vp_depth_first_shallow(NewEntry))
+Block->setParent(NewRegion);
+  return NewRegion;
+}
+
+// Clone the CFG for all nodes reachable from \p Entry, this includes cloning
+// the blocks and their recipes. Operands of cloned recipes will NOT be 
updated.
+// Remapping of operands must be done separately.
+static void cloneCFG(VPBlockBase *Entry,
+ DenseMap &Old2NewVPBlocks) {
+  ReversePostOrderTraversal> 
RPOT(
+  Entry);
+  for (VPBlockBase *BB : RPOT) {
+VPBlockBase *NewBB = cloneVPB(BB);
+for (VPBlockBase *Pred : BB->getPredecessors())
+  VPBlockUtils::connectBlocks(Old2NewVPBlocks[Pred], NewBB);
+
+Old2NewVPBlocks[BB] = NewBB;
+  }
+
+#if !defined(NDEBUG)
+  // Verify that the order of predecessors and successors matches in the cloned
+  // version.
+  ReversePostOrderTraversal>
+  NewRPOT(Old2NewVPBlocks[Entry]);
+  for (const auto &[OldBB, NewBB] : zip(RPOT, NewRPOT)) {
+for (const auto &[OldPred, NewPred] :
+ zip(OldBB->getPredecessors(), NewBB->getPredecessors()))
+  assert(NewPred == Old2NewVPBlocks[OldPred] && "Different predecessors");
+
+for (const auto &[OldSucc, NewSucc] :
+ zip(OldBB->successors(), NewBB->successors()))
+  assert(NewSucc == Old2NewVPBlocks[OldSucc] && "Different successors");
+  }
+#endif

ayalz wrote:

Return `Old2NewVPBlocks[Entry]` along with (the last) `NewBB`, or along with 
`Old2NewVPBlocks[Exit]` where Exit = *RPOT->end()?

https://github.com/llvm/llvm-project/pull/73158


[Lldb-commits] [clang] [libc] [clang-tools-extra] [compiler-rt] [flang] [llvm] [libcxx] [lld] [lldb] [VPlan] Implement cloning of VPlans. (PR #73158)

2024-01-25 Thread via lldb-commits


@@ -614,6 +614,61 @@ void VPBasicBlock::print(raw_ostream &O, const Twine 
&Indent,
   printSuccessors(O, Indent);
 }
 #endif
+static void cloneCFG(VPBlockBase *Entry,
+ DenseMap &Old2NewVPBlocks);
+
+static VPBlockBase *cloneVPB(VPBlockBase *BB) {
+  if (auto *VPBB = dyn_cast(BB)) {
+auto *NewBlock = new VPBasicBlock(VPBB->getName());
+for (VPRecipeBase &R : *VPBB)
+  NewBlock->appendRecipe(R.clone());
+return NewBlock;
+  }
+
+  auto *VPR = cast(BB);
+  DenseMap Old2NewVPBlocks;
+  DenseMap Old2NewVPValues;

ayalz wrote:

Is `Old2NewVPValues` needed here?

Hopefully `Old2NewVPBlocks` will not be needed either, doing instead something 
like:
```
  auto [NewEntry, NewExiting] = cloneSESE(VPR->getEntry());
```


https://github.com/llvm/llvm-project/pull/73158


[Lldb-commits] [clang-tools-extra] [flang] [libcxx] [llvm] [lldb] [lld] [clang] [compiler-rt] [libc] [VPlan] Implement cloning of VPlans. (PR #73158)

2024-01-25 Thread via lldb-commits


@@ -982,6 +1037,94 @@ void VPlan::updateDominatorTree(DominatorTree *DT, 
BasicBlock *LoopHeaderBB,
   assert(DT->verify(DominatorTree::VerificationLevel::Fast));
 }
 
+static void remapOperands(VPBlockBase *Entry, VPBlockBase *NewEntry,
+  DenseMap &Old2NewVPValues) {
+  // Update the operands of all cloned recipes starting at NewEntry. This
+  // traverses all reachable blocks. This is done in two steps, to handle 
cycles
+  // in PHI recipes.
+  ReversePostOrderTraversal>
+  OldDeepRPOT(Entry);
+  ReversePostOrderTraversal>
+  NewDeepRPOT(NewEntry);
+  // First, collect all mappings from old to new VPValues defined by cloned
+  // recipes.
+  for (const auto &[OldBB, NewBB] :
+   zip(VPBlockUtils::blocksOnly(OldDeepRPOT),
+   VPBlockUtils::blocksOnly(NewDeepRPOT))) {
+assert(OldBB->getRecipeList().size() == NewBB->getRecipeList().size() &&
+   "blocks must have the same number of recipes");
+
+for (const auto &[OldR, NewR] : zip(*OldBB, *NewBB)) {
+  assert(OldR.getNumOperands() == NewR.getNumOperands() &&
+ "recipes must have the same number of operands");
+  assert(OldR.getNumDefinedValues() == NewR.getNumDefinedValues() &&
+ "recipes must define the same number of operands");
+  for (const auto &[OldV, NewV] :
+   zip(OldR.definedValues(), NewR.definedValues()))
+Old2NewVPValues[OldV] = NewV;
+}
+  }
+
+  // Update all operands to use cloned VPValues.
+  for (VPBasicBlock *NewBB :
+   VPBlockUtils::blocksOnly(NewDeepRPOT)) {
+for (VPRecipeBase &NewR : *NewBB)
+  for (unsigned I = 0, E = NewR.getNumOperands(); I != E; ++I) {
+VPValue *NewOp = Old2NewVPValues.lookup(NewR.getOperand(I));
+NewR.setOperand(I, NewOp);
+  }
+  }
+}
+
+VPlan *VPlan::clone() {
+  DenseMap Old2NewVPBlocks;
+  DenseMap Old2NewVPValues;
+
+  auto *NewPlan = new VPlan();
+
+  // Clone live-ins.
+  SmallVector NewLiveIns;

ayalz wrote:

Is `NewLiveIns` needed?

https://github.com/llvm/llvm-project/pull/73158


[Lldb-commits] [libc] [clang] [lldb] [flang] [clang-tools-extra] [libcxx] [llvm] [lld] [compiler-rt] [VPlan] Implement cloning of VPlans. (PR #73158)

2024-01-25 Thread via lldb-commits


@@ -982,6 +1037,94 @@ void VPlan::updateDominatorTree(DominatorTree *DT, 
BasicBlock *LoopHeaderBB,
   assert(DT->verify(DominatorTree::VerificationLevel::Fast));
 }
 
+static void remapOperands(VPBlockBase *Entry, VPBlockBase *NewEntry,
+  DenseMap &Old2NewVPValues) {
+  // Update the operands of all cloned recipes starting at NewEntry. This
+  // traverses all reachable blocks. This is done in two steps, to handle 
cycles
+  // in PHI recipes.
+  ReversePostOrderTraversal>
+  OldDeepRPOT(Entry);
+  ReversePostOrderTraversal>
+  NewDeepRPOT(NewEntry);
+  // First, collect all mappings from old to new VPValues defined by cloned
+  // recipes.
+  for (const auto &[OldBB, NewBB] :
+   zip(VPBlockUtils::blocksOnly(OldDeepRPOT),
+   VPBlockUtils::blocksOnly(NewDeepRPOT))) {
+assert(OldBB->getRecipeList().size() == NewBB->getRecipeList().size() &&
+   "blocks must have the same number of recipes");
+
+for (const auto &[OldR, NewR] : zip(*OldBB, *NewBB)) {
+  assert(OldR.getNumOperands() == NewR.getNumOperands() &&
+ "recipes must have the same number of operands");
+  assert(OldR.getNumDefinedValues() == NewR.getNumDefinedValues() &&
+ "recipes must define the same number of operands");
+  for (const auto &[OldV, NewV] :
+   zip(OldR.definedValues(), NewR.definedValues()))
+Old2NewVPValues[OldV] = NewV;
+}
+  }
+
+  // Update all operands to use cloned VPValues.
+  for (VPBasicBlock *NewBB :
+   VPBlockUtils::blocksOnly(NewDeepRPOT)) {
+for (VPRecipeBase &NewR : *NewBB)
+  for (unsigned I = 0, E = NewR.getNumOperands(); I != E; ++I) {
+VPValue *NewOp = Old2NewVPValues.lookup(NewR.getOperand(I));
+NewR.setOperand(I, NewOp);
+  }
+  }
+}
+
+VPlan *VPlan::clone() {
+  DenseMap Old2NewVPBlocks;
+  DenseMap Old2NewVPValues;
+
+  auto *NewPlan = new VPlan();
+
+  // Clone live-ins.
+  SmallVector NewLiveIns;
+  for (VPValue *OldLiveIn : VPLiveInsToFree) {
+VPValue *NewLiveIn = new VPValue(OldLiveIn->getLiveInIRValue());
+NewPlan->VPLiveInsToFree.push_back(NewLiveIn);
+Old2NewVPValues[OldLiveIn] = NewLiveIn;
+  }
+  Old2NewVPValues[&VectorTripCount] = &NewPlan->VectorTripCount;
+  Old2NewVPValues[&VFxUF] = &NewPlan->VFxUF;
+  if (BackedgeTakenCount) {
+NewPlan->BackedgeTakenCount = new VPValue();
+Old2NewVPValues[BackedgeTakenCount] = NewPlan->BackedgeTakenCount;
+  }
+  assert(TripCount && "trip count must be set");
+  if (TripCount->isLiveIn())
+Old2NewVPValues[TripCount] = new VPValue(TripCount->getLiveInIRValue());

ayalz wrote:

// else NewTripCount will be created and inserted into `Old2NewVPValues` when 
`TripCount` is cloned. In any case `NewPlan->TripCount` is updated below.

https://github.com/llvm/llvm-project/pull/73158


[Lldb-commits] [clang] [libc] [llvm] [lldb] [clang-tools-extra] [lld] [flang] [compiler-rt] [libcxx] [VPlan] Implement cloning of VPlans. (PR #73158)

2024-01-25 Thread via lldb-commits


@@ -2694,6 +2852,9 @@ class VPlan {
   /// been modeled in VPlan directly.
   DenseMap SCEVToExpansion;
 
+  /// Construct an uninitialized VPlan, should be used for cloning only.
+  explicit VPlan() = default;
+

ayalz wrote:

Is it really needed? (See above)

https://github.com/llvm/llvm-project/pull/73158


[Lldb-commits] [lldb] [lldb] Implement WebAssembly debugging (PR #77949)

2024-01-25 Thread Quentin Michaud via lldb-commits

mh4ck-Thales wrote:

> Hi @mh4ck-Thales this is caused by [#77949 
> (comment)](https://github.com/llvm/llvm-project/pull/77949#discussion_r1463458728),
>  currently we need to modify it manually.

Thanks! That did the trick for the breakpoint and disassembly problems. When 
using `read register` I can only see `pc` and nothing else, though. I'm not 
sure treating Wasm variables as registers is the right way to go anyway, 
because the number of Wasm variables is not fixed in advance and depends on 
the execution context (local variables, for example). This is not the case at 
all for classic CPU registers, and I'm not sure the generic register-handling 
code in lldb will support that. A solution like the one @jimingham proposed, 
with subcommands of `language wasm`, may be an easier, less bug-prone way to 
implement this. 

https://github.com/llvm/llvm-project/pull/77949


[Lldb-commits] [clang] [flang] [libc] [libcxx] [clang-tools-extra] [lldb] [lld] [libunwind] [llvm] [compiler-rt] [X86] Use RORX over SHR imm (PR #77964)

2024-01-25 Thread Bryce Wilson via lldb-commits

https://github.com/Bryce-MW updated 
https://github.com/llvm/llvm-project/pull/77964

>From d4c312b9dbf447d0a53dda0e6cdc482bd908430b Mon Sep 17 00:00:00 2001
From: Bryce Wilson 
Date: Fri, 12 Jan 2024 16:01:32 -0600
Subject: [PATCH 01/15] [X86] Use RORX over SHR imm

---
 llvm/lib/Target/X86/X86InstrShiftRotate.td |  78 ++
 llvm/test/CodeGen/X86/atomic-unordered.ll  |   3 +-
 llvm/test/CodeGen/X86/bmi2.ll  |   6 +-
 llvm/test/CodeGen/X86/cmp-shiftX-maskX.ll  |   3 +-
 llvm/test/CodeGen/X86/pr35636.ll   |   4 +-
 llvm/test/CodeGen/X86/vector-trunc-ssat.ll | 116 ++---
 6 files changed, 143 insertions(+), 67 deletions(-)

diff --git a/llvm/lib/Target/X86/X86InstrShiftRotate.td 
b/llvm/lib/Target/X86/X86InstrShiftRotate.td
index f951894db1890cd..238e8e9b6e97f30 100644
--- a/llvm/lib/Target/X86/X86InstrShiftRotate.td
+++ b/llvm/lib/Target/X86/X86InstrShiftRotate.td
@@ -879,6 +879,26 @@ let Predicates = [HasBMI2, HasEGPR, In64BitMode] in {
   defm SHLX64 : bmi_shift<"shlx{q}", GR64, i64mem, "_EVEX">, T8, PD, REX_W, 
EVEX;
 }
 
+
+def immle16_8 : ImmLeaf;
+def immle32_8 : ImmLeaf;
+def immle64_8 : ImmLeaf;
+def immle32_16 : ImmLeaf;
+def immle64_16 : ImmLeaf;
+def immle64_32 : ImmLeaf;
+
 let Predicates = [HasBMI2] in {
   // Prefer RORX which is non-destructive and doesn't update EFLAGS.
   let AddedComplexity = 10 in {
@@ -891,6 +911,64 @@ let Predicates = [HasBMI2] in {
   (RORX32ri GR32:$src, (ROT32L2R_imm8 imm:$shamt))>;
 def : Pat<(rotl GR64:$src, (i8 imm:$shamt)),
   (RORX64ri GR64:$src, (ROT64L2R_imm8 imm:$shamt))>;
+
+// A right shift by less than a smaller register size that is then
+// truncated to that register size can be replaced by RORX to
+// preserve flags with the same execution cost
+
+def : Pat<(i8 (trunc (srl GR16:$src, (i8 immle16_8:$shamt,
+  (EXTRACT_SUBREG (RORX32ri (INSERT_SUBREG (i32 (IMPLICIT_DEF)), 
GR16:$src, sub_16bit), imm:$shamt), sub_8bit)>;
+def : Pat<(i8 (trunc (sra GR16:$src, (i8 immle16_8:$shamt,
+  (EXTRACT_SUBREG (RORX32ri (INSERT_SUBREG (i32 (IMPLICIT_DEF)), 
GR16:$src, sub_16bit), imm:$shamt), sub_8bit)>;
+def : Pat<(i8 (trunc (srl GR32:$src, (i8 immle32_8:$shamt,
+  (EXTRACT_SUBREG (RORX32ri GR32:$src, imm:$shamt), sub_8bit)>;
+def : Pat<(i8 (trunc (sra GR32:$src, (i8 immle32_8:$shamt,
+  (EXTRACT_SUBREG (RORX32ri GR32:$src, imm:$shamt), sub_8bit)>;
+def : Pat<(i8 (trunc (srl GR64:$src, (i8 immle64_8:$shamt,
+  (EXTRACT_SUBREG (RORX64ri GR64:$src, imm:$shamt), sub_8bit)>;
+def : Pat<(i8 (trunc (sra GR64:$src, (i8 immle64_8:$shamt,
+  (EXTRACT_SUBREG (RORX64ri GR64:$src, imm:$shamt), sub_8bit)>;
+
+
+def : Pat<(i16 (trunc (srl GR32:$src, (i8 immle32_16:$shamt,
+  (EXTRACT_SUBREG (RORX32ri GR32:$src, imm:$shamt), sub_16bit)>;
+def : Pat<(i16 (trunc (sra GR32:$src, (i8 immle32_16:$shamt,
+  (EXTRACT_SUBREG (RORX32ri GR32:$src, imm:$shamt), sub_16bit)>;
+def : Pat<(i16 (trunc (srl GR64:$src, (i8 immle64_16:$shamt,
+  (EXTRACT_SUBREG (RORX64ri GR64:$src, imm:$shamt), sub_16bit)>;
+def : Pat<(i16 (trunc (sra GR64:$src, (i8 immle64_16:$shamt,
+  (EXTRACT_SUBREG (RORX64ri GR64:$src, imm:$shamt), sub_16bit)>;
+
+def : Pat<(i32 (trunc (srl GR64:$src, (i8 immle64_32:$shamt,
+  (EXTRACT_SUBREG (RORX64ri GR64:$src, imm:$shamt), sub_32bit)>;
+def : Pat<(i32 (trunc (sra GR64:$src, (i8 immle64_32:$shamt,
+  (EXTRACT_SUBREG (RORX64ri GR64:$src, imm:$shamt), sub_32bit)>;
+
+
+// Can't expand the load
+def : Pat<(i8 (trunc (srl (loadi32 addr:$src), (i8 immle32_8:$shamt,
+  (EXTRACT_SUBREG (RORX32mi addr:$src, imm:$shamt), sub_8bit)>;
+def : Pat<(i8 (trunc (sra (loadi32 addr:$src), (i8 immle32_8:$shamt,
+  (EXTRACT_SUBREG (RORX32mi addr:$src, imm:$shamt), sub_8bit)>;
+def : Pat<(i8 (trunc (srl (loadi64 addr:$src), (i8 immle64_8:$shamt,
+  (EXTRACT_SUBREG (RORX64mi addr:$src, imm:$shamt), sub_8bit)>;
+def : Pat<(i8 (trunc (sra (loadi64 addr:$src), (i8 immle64_8:$shamt,
+  (EXTRACT_SUBREG (RORX64mi addr:$src, imm:$shamt), sub_8bit)>;
+
+
+def : Pat<(i16 (trunc (srl (loadi32 addr:$src), (i8 immle32_16:$shamt,
+  (EXTRACT_SUBREG (RORX32mi addr:$src, imm:$shamt), sub_16bit)>;
+def : Pat<(i16 (trunc (sra (loadi32 addr:$src), (i8 immle32_16:$shamt,
+  (EXTRACT_SUBREG (RORX32mi addr:$src, imm:$shamt), sub_16bit)>;
+def : Pat<(i16 (trunc (srl (loadi64 addr:$src), (i8 immle64_16:$shamt,
+  (EXTRACT_SUBREG (RORX64mi addr:$src, imm:$shamt), sub_16bit)>;
+def : Pat<(i16 (trunc (sra (loadi64 addr:$src), (i8 immle64_16:$shamt,
+  (EXTRACT_SUBREG (RORX64mi addr:$src, imm:$shamt), sub_16bit)>;
+
+def : Pat<(i32 (trunc (

[Lldb-commits] [clang] [flang] [libc] [libcxx] [clang-tools-extra] [lldb] [lld] [libunwind] [llvm] [compiler-rt] [X86] Use RORX over SHR imm (PR #77964)

2024-01-25 Thread Bryce Wilson via lldb-commits


@@ -4216,6 +4217,97 @@ MachineSDNode *X86DAGToDAGISel::emitPCMPESTR(unsigned 
ROpc, unsigned MOpc,
   return CNode;
 }
 
+// When the consumer of a right shift (arithmetic or logical) wouldn't notice
+// the difference if the instruction was a rotate right instead (because the
+// bits shifted in are truncated away), the shift can be replaced by the RORX
+// instruction from BMI2. This doesn't set flags and can output to a different
+// register. However, this increases code size in most cases, and doesn't leave
+// the high bits in a useful state. There may be other situations where this
+// transformation is profitable given those conditions, but currently the
+// transformation is only made when it likely avoids spilling flags.
+bool X86DAGToDAGISel::rightShiftUncloberFlags(SDNode *N) {
+  EVT VT = N->getValueType(0);
+
+  // Target has to have BMI2 for RORX
+  if (!Subtarget->hasBMI2())
+return false;
+
+  // Only handle scalar shifts.
+  if (VT.isVector())
+return false;
+
+  unsigned OpSize;
+  if (VT == MVT::i64)
+OpSize = 64;
+  else if (VT == MVT::i32)
+OpSize = 32;
+  else if (VT == MVT::i16)
+OpSize = 16;
+  else if (VT == MVT::i8)
+return false; // i8 shift can't be truncated.
+  else
+llvm_unreachable("Unexpected shift size");
+
+  unsigned TruncateSize = 0;
+  // This only works when the result is truncated.
+  for (const SDNode *User : N->uses()) {
+auto name = User->getOperationName(CurDAG);
+if (!User->isMachineOpcode() ||
+User->getMachineOpcode() != TargetOpcode::EXTRACT_SUBREG)
+  return false;
+EVT TuncateType = User->getValueType(0);
+if (TuncateType == MVT::i32)
+  TruncateSize = std::max(TruncateSize, 32U);
+else if (TuncateType == MVT::i16)
+  TruncateSize = std::max(TruncateSize, 16U);
+else if (TuncateType == MVT::i8)
+  TruncateSize = std::max(TruncateSize, 8U);
+else
+  return false;
+  }
+  if (TruncateSize >= OpSize)
+return false;
+
+  // The shift must be by an immediate that wouldn't expose the zero or sign
+  // extended result.
+  auto *ShiftAmount = dyn_cast<ConstantSDNode>(N->getOperand(1));
+  if (!ShiftAmount || ShiftAmount->getZExtValue() > OpSize - TruncateSize)
+return false;
+
+  // Only make the replacement when it avoids clobbering used flags. This is a
+  // similar heuristic as used in the conversion to LEA, namely looking at the
+  // operand for an instruction that creates flags where those flags are used.
+  // This will have both false positives and false negatives. Ideally, both of
+  // these happen later on. Perhaps in copy to flags lowering or in register
+  // allocation.
+  bool MightClobberFlags = false;
+  SDNode *Input = N->getOperand(0).getNode();
+  for (auto Use : Input->uses()) {
+if (Use->getOpcode() == ISD::CopyToReg) {
+  auto *RegisterNode =
+  dyn_cast<RegisterSDNode>(Use->getOperand(1).getNode());
+  if (RegisterNode && RegisterNode->getReg() == X86::EFLAGS) {
+MightClobberFlags = true;
+break;
+  }
+}
+  }
+  if (!MightClobberFlags)
+return false;

Bryce-MW wrote:

It should be correct? I've clarified the names and the explanation a bit, but 
it's possible that I got the logic wrong.
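
For anyone checking the bound by hand, here is a minimal standalone sketch (plain C++, not LLVM code) of why the shift amount has to stay within `OpSize - TruncateSize`. It assumes a 32-bit operand truncated to 16 bits and only models the logical shift; `rotr32`, `truncatedResultsMatch` and `shiftAmountIsSafe` are hypothetical names used purely for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 32-bit value right by Amount bits (Amount is reduced modulo 32).
static uint32_t rotr32(uint32_t Value, unsigned Amount) {
  Amount &= 31u;
  if (Amount == 0)
    return Value; // Avoid the undefined shift by 32 below.
  return (Value >> Amount) | (Value << (32u - Amount));
}

// Returns true when a logical right shift and a rotate right agree once the
// result is truncated to the low 16 bits. Amount must be below 32.
static bool truncatedResultsMatch(uint32_t Value, unsigned Amount) {
  uint16_t ViaShift = static_cast<uint16_t>(Value >> Amount);
  uint16_t ViaRotate = static_cast<uint16_t>(rotr32(Value, Amount));
  return ViaShift == ViaRotate;
}

// Mirrors the bound checked in the patch: the replacement is only considered
// when the shift amount is at most OpSize - TruncateSize.
static bool shiftAmountIsSafe(unsigned OpSize, unsigned TruncateSize,
                              unsigned Amount) {
  return TruncateSize < OpSize && Amount <= OpSize - TruncateSize;
}
```

With a value whose lowest bit is set, such as `0xDEADBEEF`, a shift by 17 exposes the shifted-in zero in bit 15 of the truncated result, while the rotate brings bit 0 of the input there instead, so the two results differ.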

https://github.com/llvm/llvm-project/pull/77964
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [llvm] [flang] [lldb] [compiler-rt] [lld] [libclc] [libcxxabi] [clang-tools-extra] [clang] [libcxx] [libc] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread via lldb-commits


@@ -491,19 +491,41 @@ void VPlanTransforms::removeDeadRecipes(VPlan &Plan) {
 
 static VPValue *createScalarIVSteps(VPlan &Plan, const InductionDescriptor &ID,
 ScalarEvolution &SE, Instruction *TruncI,
-Type *IVTy, VPValue *StartV,
-VPValue *Step) {
+VPValue *StartV, VPValue *Step) {
   VPBasicBlock *HeaderVPBB = Plan.getVectorLoopRegion()->getEntryBasicBlock();
   auto IP = HeaderVPBB->getFirstNonPhi();
   VPCanonicalIVPHIRecipe *CanonicalIV = Plan.getCanonicalIV();
-  Type *TruncTy = TruncI ? TruncI->getType() : IVTy;
   VPValue *BaseIV = CanonicalIV;
-  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step, TruncTy)) {
-BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step,
-   TruncI ? TruncI->getType() : nullptr);
+  VPTypeAnalysis TypeInfo(SE.getContext());
+  Type *StepTy = TypeInfo.inferScalarType(Step);
+  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step)) {
+BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step);
 HeaderVPBB->insert(BaseIV->getDefiningRecipe(), IP);
   }
 
+  // Truncate base induction if needed.
+  if (TruncI) {
+Type *TruncTy = TruncI->getType();
+assert(TypeInfo.inferScalarType(BaseIV)->getScalarSizeInBits() >
+   TruncTy->getScalarSizeInBits() &&
+   StepTy->isIntegerTy() && "Truncation requires an integer step");
+auto *T = new VPScalarCastRecipe(Instruction::Trunc, BaseIV, TruncTy);
+HeaderVPBB->insert(T, IP);
+BaseIV = T;

ayalz wrote:

```suggestion
    ResultTy = TruncTy;
    BaseIV = new VPScalarCastRecipe(Instruction::Trunc, BaseIV, ResultTy);
    HeaderVPBB->insert(BaseIV, IP);
```
?

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [libclc] [libcxx] [clang-tools-extra] [llvm] [lld] [libc] [clang] [libcxxabi] [lldb] [flang] [compiler-rt] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread via lldb-commits


@@ -491,19 +491,41 @@ void VPlanTransforms::removeDeadRecipes(VPlan &Plan) {
 
 static VPValue *createScalarIVSteps(VPlan &Plan, const InductionDescriptor &ID,
 ScalarEvolution &SE, Instruction *TruncI,
-Type *IVTy, VPValue *StartV,
-VPValue *Step) {
+VPValue *StartV, VPValue *Step) {
   VPBasicBlock *HeaderVPBB = Plan.getVectorLoopRegion()->getEntryBasicBlock();
   auto IP = HeaderVPBB->getFirstNonPhi();
   VPCanonicalIVPHIRecipe *CanonicalIV = Plan.getCanonicalIV();
-  Type *TruncTy = TruncI ? TruncI->getType() : IVTy;
   VPValue *BaseIV = CanonicalIV;
-  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step, TruncTy)) {
-BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step,
-   TruncI ? TruncI->getType() : nullptr);
+  VPTypeAnalysis TypeInfo(SE.getContext());
+  Type *StepTy = TypeInfo.inferScalarType(Step);
+  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step)) {
+BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step);
 HeaderVPBB->insert(BaseIV->getDefiningRecipe(), IP);
   }
 
+  // Truncate base induction if needed.
+  if (TruncI) {
+Type *TruncTy = TruncI->getType();
+assert(TypeInfo.inferScalarType(BaseIV)->getScalarSizeInBits() >
+   TruncTy->getScalarSizeInBits() &&
+   StepTy->isIntegerTy() && "Truncation requires an integer step");
+auto *T = new VPScalarCastRecipe(Instruction::Trunc, BaseIV, TruncTy);
+HeaderVPBB->insert(T, IP);
+BaseIV = T;
+  }
+
+  // Truncate step if needed.
+  Type *BaseIVTy = TypeInfo.inferScalarType(BaseIV);

ayalz wrote:

```suggestion
  Type *StepTy = TypeInfo.inferScalarType(Step);
```

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [clang-tools-extra] [libc] [flang] [clang] [lldb] [llvm] [libcxxabi] [libclc] [lld] [libcxx] [compiler-rt] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread via lldb-commits


@@ -491,19 +491,41 @@ void VPlanTransforms::removeDeadRecipes(VPlan &Plan) {
 
 static VPValue *createScalarIVSteps(VPlan &Plan, const InductionDescriptor &ID,
 ScalarEvolution &SE, Instruction *TruncI,
-Type *IVTy, VPValue *StartV,
-VPValue *Step) {
+VPValue *StartV, VPValue *Step) {
   VPBasicBlock *HeaderVPBB = Plan.getVectorLoopRegion()->getEntryBasicBlock();
   auto IP = HeaderVPBB->getFirstNonPhi();
   VPCanonicalIVPHIRecipe *CanonicalIV = Plan.getCanonicalIV();
-  Type *TruncTy = TruncI ? TruncI->getType() : IVTy;
   VPValue *BaseIV = CanonicalIV;
-  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step, TruncTy)) {
-BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step,
-   TruncI ? TruncI->getType() : nullptr);
+  VPTypeAnalysis TypeInfo(SE.getContext());
+  Type *StepTy = TypeInfo.inferScalarType(Step);

ayalz wrote:

```suggestion
```

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [clang] [libcxxabi] [lldb] [llvm] [lld] [clang-tools-extra] [libcxx] [libclc] [compiler-rt] [libc] [flang] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread via lldb-commits


@@ -491,19 +491,41 @@ void VPlanTransforms::removeDeadRecipes(VPlan &Plan) {
 
 static VPValue *createScalarIVSteps(VPlan &Plan, const InductionDescriptor &ID,
 ScalarEvolution &SE, Instruction *TruncI,
-Type *IVTy, VPValue *StartV,
-VPValue *Step) {
+VPValue *StartV, VPValue *Step) {
   VPBasicBlock *HeaderVPBB = Plan.getVectorLoopRegion()->getEntryBasicBlock();
   auto IP = HeaderVPBB->getFirstNonPhi();
   VPCanonicalIVPHIRecipe *CanonicalIV = Plan.getCanonicalIV();
-  Type *TruncTy = TruncI ? TruncI->getType() : IVTy;
   VPValue *BaseIV = CanonicalIV;
-  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step, TruncTy)) {
-BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step,
-   TruncI ? TruncI->getType() : nullptr);
+  VPTypeAnalysis TypeInfo(SE.getContext());
+  Type *StepTy = TypeInfo.inferScalarType(Step);
+  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step)) {
+BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step);
 HeaderVPBB->insert(BaseIV->getDefiningRecipe(), IP);
   }
 
+  // Truncate base induction if needed.

ayalz wrote:

```
  VPTypeAnalysis TypeInfo(SE.getContext());
  Type *ResultTy = TypeInfo.inferScalarType(BaseIV);
```
Both BaseIV and Step are subject to redefinition and truncation; perhaps aim to 
define the type of the result to which they may be truncated.
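
A toy model of that idea, as a sketch only: it tracks bit widths instead of LLVM types and uses no VPlan API. `TruncationPlan` and `planTruncation` are hypothetical names, and `TruncBits == 0` stands in for `TruncI` being null.

```cpp
#include <cassert>

// Hypothetical summary of the casts createScalarIVSteps would need to emit.
struct TruncationPlan {
  unsigned ResultBits; // Width that both operands end up with.
  bool TruncateBaseIV; // A trunc of the base IV is needed.
  bool TruncateStep;   // A trunc of the step is needed.
};

// TruncBits == 0 models "no truncating instruction" (TruncI == nullptr).
static TruncationPlan planTruncation(unsigned BaseIVBits, unsigned StepBits,
                                     unsigned TruncBits) {
  // Start from the base IV's width, then narrow it if a trunc is requested.
  TruncationPlan Plan{BaseIVBits, false, false};
  if (TruncBits != 0) {
    assert(Plan.ResultBits > TruncBits && "Not truncating the base IV");
    Plan.ResultBits = TruncBits;
    Plan.TruncateBaseIV = true;
  }
  // The step is compared against the single result width, not the base IV.
  if (StepBits != Plan.ResultBits) {
    assert(StepBits > Plan.ResultBits && "Not truncating the step");
    Plan.TruncateStep = true;
  }
  return Plan;
}
```

Both checks then refer to one `ResultBits` value, which is the point of introducing `ResultTy` in the suggestion.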

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [libcxxabi] [flang] [libc] [llvm] [lld] [lldb] [compiler-rt] [clang-tools-extra] [clang] [libcxx] [libclc] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread via lldb-commits


@@ -491,19 +491,41 @@ void VPlanTransforms::removeDeadRecipes(VPlan &Plan) {
 
 static VPValue *createScalarIVSteps(VPlan &Plan, const InductionDescriptor &ID,
 ScalarEvolution &SE, Instruction *TruncI,
-Type *IVTy, VPValue *StartV,
-VPValue *Step) {
+VPValue *StartV, VPValue *Step) {
   VPBasicBlock *HeaderVPBB = Plan.getVectorLoopRegion()->getEntryBasicBlock();
   auto IP = HeaderVPBB->getFirstNonPhi();
   VPCanonicalIVPHIRecipe *CanonicalIV = Plan.getCanonicalIV();
-  Type *TruncTy = TruncI ? TruncI->getType() : IVTy;
   VPValue *BaseIV = CanonicalIV;

ayalz wrote:

nit: defining BaseIV as a VPSingleDefRecipe* rather than VPValue* will allow 
removal of ->getDefiningRecipe().
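
To illustrate the nit with a self-contained sketch: the classes below are toy stand-ins (all `Toy*` names are hypothetical, not the VPlan classes) for a recipe that is both a recipe and a value. Holding the pointer with the more derived static type lets it convert implicitly to the recipe base, so no lookup call is needed.

```cpp
#include <cassert>
#include <vector>

struct ToyRecipe {
  virtual ~ToyRecipe() = default;
};

struct ToyValue {
  virtual ~ToyValue() = default;
  // A plain value only reaches its defining recipe through a lookup.
  virtual ToyRecipe *getDefiningRecipe() = 0;
};

// A recipe that defines exactly one value: it is both a recipe and a value.
struct ToySingleDefRecipe : ToyRecipe, ToyValue {
  ToyRecipe *getDefiningRecipe() override { return this; }
};

struct ToyBlock {
  std::vector<ToyRecipe *> Recipes;
  void insert(ToyRecipe *R) { Recipes.push_back(R); }
};

// Returns true when both spellings insert the same recipe object.
static bool bothSpellingsInsertSameRecipe() {
  ToySingleDefRecipe Def;
  ToyBlock Block;

  ToyValue *AsValue = &Def;
  Block.insert(AsValue->getDefiningRecipe()); // Needs the lookup.

  ToySingleDefRecipe *AsRecipe = &Def;
  Block.insert(AsRecipe); // Converts implicitly to ToyRecipe *.

  return Block.Recipes[0] == Block.Recipes[1];
}
```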

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [clang] [libcxxabi] [llvm] [libclc] [lld] [flang] [lldb] [libcxx] [libc] [clang-tools-extra] [compiler-rt] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread via lldb-commits


@@ -1469,6 +1461,52 @@ void VPReplicateRecipe::print(raw_ostream &O, const 
Twine &Indent,
 }
 #endif
 
+static bool isUniformAcrossVFsAndUFs(VPScalarCastRecipe *C) {
+  return C->isDefinedOutsideVectorRegions() ||
+ isa<VPDerivedIVRecipe>(C->getOperand(0)) ||
+ isa<VPCanonicalIVPHIRecipe>(C->getOperand(0));
+}
+
+Value *VPScalarCastRecipe ::generate(VPTransformState &State, unsigned Part) {
+  assert(vputils::onlyFirstLaneUsed(this) &&
+ "Codegen only implemented for first lane.");
+  switch (Opcode) {
+  case Instruction::SExt:
+  case Instruction::ZExt:

ayalz wrote:

It would be good to either remove the SExt and ZExt cases or add a note that 
they are currently unused (dead).

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [clang] [clang-tools-extra] [libcxx] [llvm] [lldb] [libcxxabi] [lld] [libc] [flang] [compiler-rt] [libclc] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread via lldb-commits


@@ -491,19 +491,41 @@ void VPlanTransforms::removeDeadRecipes(VPlan &Plan) {
 
 static VPValue *createScalarIVSteps(VPlan &Plan, const InductionDescriptor &ID,
 ScalarEvolution &SE, Instruction *TruncI,
-Type *IVTy, VPValue *StartV,
-VPValue *Step) {
+VPValue *StartV, VPValue *Step) {
   VPBasicBlock *HeaderVPBB = Plan.getVectorLoopRegion()->getEntryBasicBlock();
   auto IP = HeaderVPBB->getFirstNonPhi();
   VPCanonicalIVPHIRecipe *CanonicalIV = Plan.getCanonicalIV();
-  Type *TruncTy = TruncI ? TruncI->getType() : IVTy;
   VPValue *BaseIV = CanonicalIV;
-  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step, TruncTy)) {
-BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step,
-   TruncI ? TruncI->getType() : nullptr);
+  VPTypeAnalysis TypeInfo(SE.getContext());
+  Type *StepTy = TypeInfo.inferScalarType(Step);
+  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step)) {
+BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step);
 HeaderVPBB->insert(BaseIV->getDefiningRecipe(), IP);
   }
 
+  // Truncate base induction if needed.
+  if (TruncI) {
+Type *TruncTy = TruncI->getType();
+assert(TypeInfo.inferScalarType(BaseIV)->getScalarSizeInBits() >
+   TruncTy->getScalarSizeInBits() &&
+   StepTy->isIntegerTy() && "Truncation requires an integer step");

ayalz wrote:

```suggestion
    assert(ResultTy->getScalarSizeInBits() >
               TruncTy->getScalarSizeInBits() &&
           ResultTy->isIntegerTy() && "Truncation requires an integer type");
```

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [libclc] [clang] [libc] [lldb] [compiler-rt] [libcxx] [libcxxabi] [lld] [llvm] [flang] [clang-tools-extra] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread via lldb-commits

https://github.com/ayalz edited https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [clang] [lldb] [libc] [libcxx] [libclc] [lld] [llvm] [flang] [clang-tools-extra] [libcxxabi] [compiler-rt] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread via lldb-commits


@@ -491,19 +491,41 @@ void VPlanTransforms::removeDeadRecipes(VPlan &Plan) {
 
 static VPValue *createScalarIVSteps(VPlan &Plan, const InductionDescriptor &ID,
 ScalarEvolution &SE, Instruction *TruncI,
-Type *IVTy, VPValue *StartV,
-VPValue *Step) {
+VPValue *StartV, VPValue *Step) {
   VPBasicBlock *HeaderVPBB = Plan.getVectorLoopRegion()->getEntryBasicBlock();
   auto IP = HeaderVPBB->getFirstNonPhi();
   VPCanonicalIVPHIRecipe *CanonicalIV = Plan.getCanonicalIV();
-  Type *TruncTy = TruncI ? TruncI->getType() : IVTy;
   VPValue *BaseIV = CanonicalIV;
-  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step, TruncTy)) {
-BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step,
-   TruncI ? TruncI->getType() : nullptr);
+  VPTypeAnalysis TypeInfo(SE.getContext());
+  Type *StepTy = TypeInfo.inferScalarType(Step);
+  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step)) {
+BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step);
 HeaderVPBB->insert(BaseIV->getDefiningRecipe(), IP);
   }
 
+  // Truncate base induction if needed.
+  if (TruncI) {
+Type *TruncTy = TruncI->getType();
+assert(TypeInfo.inferScalarType(BaseIV)->getScalarSizeInBits() >
+   TruncTy->getScalarSizeInBits() &&
+   StepTy->isIntegerTy() && "Truncation requires an integer step");
+auto *T = new VPScalarCastRecipe(Instruction::Trunc, BaseIV, TruncTy);
+HeaderVPBB->insert(T, IP);
+BaseIV = T;
+  }
+
+  // Truncate step if needed.
+  Type *BaseIVTy = TypeInfo.inferScalarType(BaseIV);
+  if (BaseIVTy != StepTy) {
+assert(StepTy->getScalarSizeInBits() > BaseIVTy->getScalarSizeInBits() &&
+   "Not truncating.");

ayalz wrote:

```suggestion
  if (StepTy != ResultTy) {
    assert(StepTy->getScalarSizeInBits() > ResultTy->getScalarSizeInBits() &&
           StepTy->isIntegerTy() && "Truncation requires an integer step");
```

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [clang] [flang] [libc] [libcxx] [libcxxabi] [clang-tools-extra] [lldb] [lld] [llvm] [libclc] [compiler-rt] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread via lldb-commits

https://github.com/ayalz commented:

Looks good to me, adding some minor suggestions.

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [libcxx] [libcxxabi] [lldb] [lld] [llvm] [clang] [clang-tools-extra] [libc] [libclc] [flang] [compiler-rt] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread via lldb-commits


@@ -859,6 +859,7 @@ class VPSingleDefRecipe : public VPRecipeBase, public 
VPValue {
 case VPRecipeBase::VPWidenIntOrFpInductionSC:
 case VPRecipeBase::VPWidenPointerInductionSC:
 case VPRecipeBase::VPReductionPHISC:
+case VPRecipeBase::VPScalarCastSC:

ayalz wrote:

nit (independent of this patch): it would be good to keep these cases in order.

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [llvm] [libcxxabi] [lld] [libcxx] [clang] [libclc] [clang-tools-extra] [flang] [lldb] [libc] [compiler-rt] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread via lldb-commits


@@ -491,19 +491,41 @@ void VPlanTransforms::removeDeadRecipes(VPlan &Plan) {
 
 static VPValue *createScalarIVSteps(VPlan &Plan, const InductionDescriptor &ID,
 ScalarEvolution &SE, Instruction *TruncI,
-Type *IVTy, VPValue *StartV,
-VPValue *Step) {
+VPValue *StartV, VPValue *Step) {
   VPBasicBlock *HeaderVPBB = Plan.getVectorLoopRegion()->getEntryBasicBlock();
   auto IP = HeaderVPBB->getFirstNonPhi();
   VPCanonicalIVPHIRecipe *CanonicalIV = Plan.getCanonicalIV();
-  Type *TruncTy = TruncI ? TruncI->getType() : IVTy;
   VPValue *BaseIV = CanonicalIV;
-  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step, TruncTy)) {
-BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step,
-   TruncI ? TruncI->getType() : nullptr);
+  VPTypeAnalysis TypeInfo(SE.getContext());
+  Type *StepTy = TypeInfo.inferScalarType(Step);
+  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step)) {
+BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step);
 HeaderVPBB->insert(BaseIV->getDefiningRecipe(), IP);
   }
 
+  // Truncate base induction if needed.
+  if (TruncI) {
+Type *TruncTy = TruncI->getType();
+assert(TypeInfo.inferScalarType(BaseIV)->getScalarSizeInBits() >
+   TruncTy->getScalarSizeInBits() &&
+   StepTy->isIntegerTy() && "Truncation requires an integer step");
+auto *T = new VPScalarCastRecipe(Instruction::Trunc, BaseIV, TruncTy);
+HeaderVPBB->insert(T, IP);
+BaseIV = T;
+  }
+
+  // Truncate step if needed.
+  Type *BaseIVTy = TypeInfo.inferScalarType(BaseIV);
+  if (BaseIVTy != StepTy) {
+assert(StepTy->getScalarSizeInBits() > BaseIVTy->getScalarSizeInBits() &&
+   "Not truncating.");
+
+Step = new VPScalarCastRecipe(Instruction::Trunc, Step, BaseIVTy);

ayalz wrote:

```suggestion
Step = new VPScalarCastRecipe(Instruction::Trunc, Step, ResultTy);
```

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [lldb] [compiler-rt] [clang-tools-extra] [clang] [libc] [libunwind] [lld] [libcxx] [flang] [mlir] [llvm] [ELF] Implement R_RISCV_TLSDESC for RISC-V (PR #79239)

2024-01-25 Thread Paul Kirth via lldb-commits


@@ -513,29 +547,125 @@ void RISCV::relocate(uint8_t *loc, const Relocation 
&rel, uint64_t val) const {
 break;
 
   case R_RISCV_RELAX:
-return; // Ignored (for now)
-
+return;
+  case R_RISCV_TLSDESC:
+// The addend is stored in the second word.
+if (config->is64)
+  write64le(loc + 8, val);
+else
+  write32le(loc + 4, val);
+break;
   default:
 llvm_unreachable("unknown relocation");
   }
 }
 
+static void tlsdescToIe(uint8_t *loc, const Relocation &rel, uint64_t val) {
+  switch (rel.type) {
+  case R_RISCV_TLSDESC_HI20:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+write32le(loc, 0x0013); // nop
+break;
+  case R_RISCV_TLSDESC_ADD_LO12:
+write32le(loc, utype(AUIPC, X_A0, hi20(val))); // auipc a0,
+break;
+  case R_RISCV_TLSDESC_CALL:
+if (config->is64)
+  write32le(loc, itype(LD, X_A0, X_A0, lo12(val))); // ld a0,(a0)
+else
+  write32le(loc, itype(LW, X_A0, X_A0, lo12(val))); // lw a0,(a0)
+break;
+  default:
+llvm_unreachable("unsupported relocation for TLSDESC to IE");
+  }
+}
+
+static void tlsdescToLe(uint8_t *loc, const Relocation &rel, uint64_t val) {
+  switch (rel.type) {
+  case R_RISCV_TLSDESC_HI20:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+write32le(loc, 0x0013); // nop
+return;
+  case R_RISCV_TLSDESC_ADD_LO12:
+if (isInt<12>(val))
+  write32le(loc, 0x0013); // nop
+else
+  write32le(loc, utype(LUI, X_A0, hi20(val))); // lui a0,
+return;
+  case R_RISCV_TLSDESC_CALL:
+if (isInt<12>(val))
+  write32le(loc, itype(ADDI, X_A0, 0, val)); // addi a0,zero,
+else
+  write32le(loc, itype(ADDI, X_A0, X_A0, lo12(val))); // addi a0,a0,
+return;
+  default:
+llvm_unreachable("unsupported relocation for TLSDESC to LE");
+  }
+}
+
 void RISCV::relocateAlloc(InputSectionBase &sec, uint8_t *buf) const {
   uint64_t secAddr = sec.getOutputSection()->addr;
   if (auto *s = dyn_cast<InputSection>(&sec))
 secAddr += s->outSecOff;
   else if (auto *ehIn = dyn_cast<EhInputSection>(&sec))
 secAddr += ehIn->getParent()->outSecOff;
-  for (size_t i = 0, size = sec.relocs().size(); i != size; ++i) {
-const Relocation &rel = sec.relocs()[i];
+  uint64_t tlsdescVal = 0;
+  bool isToLe = false;
+  const ArrayRef<Relocation> relocs = sec.relocs();
+  for (size_t i = 0, size = relocs.size(); i != size; ++i) {
+const Relocation &rel = relocs[i];
 uint8_t *loc = buf + rel.offset;
-const uint64_t val =
+uint64_t val =
 sec.getRelocTargetVA(sec.file, rel.type, rel.addend,
  secAddr + rel.offset, *rel.sym, rel.expr);
 
 switch (rel.expr) {
 case R_RELAX_HINT:
+  continue;
+case R_TLSDESC_PC:
+  // For R_RISCV_TLSDESC_HI20, store &got(sym)-PC to be used by the
+  // following two instructions L[DW] and ADDI.
+  if (rel.type == R_RISCV_TLSDESC_HI20)
+tlsdescVal = val;
+  else
+val = tlsdescVal;
   break;
+case R_RELAX_TLS_GD_TO_IE:
+  // Only R_RISCV_TLSDESC_HI20 reaches here. tlsdescVal will be finalized
+  // after we see R_RISCV_TLSDESC_ADD_LO12 in the R_RELAX_TLS_GD_TO_LE 
case.
+  // The net effect is that tlsdescVal will be smaller than `val` to take
+  // into account of NOP instructions (in the absence of R_RISCV_RELAX)
+  // before AUIPC.
+  tlsdescVal = val + rel.offset;
+  isToLe = false;
+  if (!(i + 1 != relocs.size() && relocs[i + 1].type == R_RISCV_RELAX))
+tlsdescToIe(loc, rel, val);
+  continue;
+case R_RELAX_TLS_GD_TO_LE:
+  // See the comment in handleTlsRelocation. For TLSDESC=>IE,
+  // R_RISCV_TLSDESC_{LOAD_LO12,ADD_LO12,CALL} also reach here. If isToIe 
is
+  // true, this is actually TLSDESC=>IE optimization.
+  if (rel.type == R_RISCV_TLSDESC_HI20) {
+tlsdescVal = val;
+isToLe = true;
+  } else {
+if (!isToLe && rel.type == R_RISCV_TLSDESC_ADD_LO12)
+  tlsdescVal -= rel.offset;
+val = tlsdescVal;
+  }
+  // When NOP conversion is eligible and R_RISCV_RELAX is present, don't
+  // write a NOP in case an unrelated instruction follows the current
+  // instruction.
+  if ((rel.type == R_RISCV_TLSDESC_HI20 ||
+   rel.type == R_RISCV_TLSDESC_LOAD_LO12 ||
+   (rel.type == R_RISCV_TLSDESC_ADD_LO12 && isToLe && !hi20(val))) &&
+  i + 1 != relocs.size() && relocs[i + 1].type == R_RISCV_RELAX)

ilovepi wrote:

This is a pretty complicated condition. I know it's a one-off, but do you 
think it makes sense to use a helper, just for readability? Maybe 
`canReplaceTLSDESCWithNop()` or `isTLSDESCRelocEligibleForNop()`?

It may make sense to at least use a helper for 
`!(i + 1 != relocs.size() && relocs[i + 1].type == R_RISCV_RELAX)`, since it's 
used another time.
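
A rough sketch of what such a helper could look like, using simplified stand-in types rather than lld's `Relocation` and `RelType`. `ToyRelType`, `ToyReloc` and `isFollowedByRelax` are hypothetical names, so treat this as an illustration of the shape, not a drop-in replacement.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified relocation kinds; the real code uses lld's RelType.
enum ToyRelType { TOY_TLSDESC_HI20, TOY_TLSDESC_CALL, TOY_RELAX };

struct ToyReloc {
  ToyRelType Type;
};

// True when relocation I is immediately followed by a relaxation marker.
// The bounds check comes first so the last relocation is handled safely.
static bool isFollowedByRelax(const std::vector<ToyReloc> &Relocs,
                              std::size_t I) {
  return I + 1 < Relocs.size() && Relocs[I + 1].Type == TOY_RELAX;
}
```

Both call sites would then read as `isFollowedByRelax(relocs, i)` or its negation.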

I'm fine either way, since this is a style choice (and is really a nit), but I 
think it would be easier to understand if some of the complexity was abstracted 

[Lldb-commits] [clang] [lldb] [libunwind] [libcxx] [compiler-rt] [libc] [flang] [lld] [llvm] [clang-tools-extra] [mlir] [ELF] Implement R_RISCV_TLSDESC for RISC-V (PR #79239)

2024-01-25 Thread Fangrui Song via lldb-commits


@@ -513,29 +547,125 @@ void RISCV::relocate(uint8_t *loc, const Relocation 
&rel, uint64_t val) const {
 break;
 
   case R_RISCV_RELAX:
-return; // Ignored (for now)
-
+return;
+  case R_RISCV_TLSDESC:
+// The addend is stored in the second word.
+if (config->is64)
+  write64le(loc + 8, val);
+else
+  write32le(loc + 4, val);
+break;
   default:
 llvm_unreachable("unknown relocation");
   }
 }
 
+static void tlsdescToIe(uint8_t *loc, const Relocation &rel, uint64_t val) {
+  switch (rel.type) {
+  case R_RISCV_TLSDESC_HI20:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+write32le(loc, 0x0013); // nop
+break;
+  case R_RISCV_TLSDESC_ADD_LO12:
+write32le(loc, utype(AUIPC, X_A0, hi20(val))); // auipc a0,
+break;
+  case R_RISCV_TLSDESC_CALL:
+if (config->is64)
+  write32le(loc, itype(LD, X_A0, X_A0, lo12(val))); // ld a0,(a0)
+else
+  write32le(loc, itype(LW, X_A0, X_A0, lo12(val))); // lw a0,(a0)
+break;
+  default:
+llvm_unreachable("unsupported relocation for TLSDESC to IE");
+  }
+}
+
+static void tlsdescToLe(uint8_t *loc, const Relocation &rel, uint64_t val) {
+  switch (rel.type) {
+  case R_RISCV_TLSDESC_HI20:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+write32le(loc, 0x0013); // nop
+return;
+  case R_RISCV_TLSDESC_ADD_LO12:
+if (isInt<12>(val))
+  write32le(loc, 0x0013); // nop
+else
+  write32le(loc, utype(LUI, X_A0, hi20(val))); // lui a0,
+return;
+  case R_RISCV_TLSDESC_CALL:
+if (isInt<12>(val))
+  write32le(loc, itype(ADDI, X_A0, 0, val)); // addi a0,zero,
+else
+  write32le(loc, itype(ADDI, X_A0, X_A0, lo12(val))); // addi a0,a0,
+return;
+  default:
+llvm_unreachable("unsupported relocation for TLSDESC to LE");
+  }
+}
+
 void RISCV::relocateAlloc(InputSectionBase &sec, uint8_t *buf) const {
   uint64_t secAddr = sec.getOutputSection()->addr;
   if (auto *s = dyn_cast<InputSection>(&sec))
 secAddr += s->outSecOff;
   else if (auto *ehIn = dyn_cast<EhInputSection>(&sec))
 secAddr += ehIn->getParent()->outSecOff;
-  for (size_t i = 0, size = sec.relocs().size(); i != size; ++i) {
-const Relocation &rel = sec.relocs()[i];
+  uint64_t tlsdescVal = 0;
+  bool isToLe = false;
+  const ArrayRef<Relocation> relocs = sec.relocs();
+  for (size_t i = 0, size = relocs.size(); i != size; ++i) {
+const Relocation &rel = relocs[i];
 uint8_t *loc = buf + rel.offset;
-const uint64_t val =
+uint64_t val =
 sec.getRelocTargetVA(sec.file, rel.type, rel.addend,
  secAddr + rel.offset, *rel.sym, rel.expr);
 
 switch (rel.expr) {
 case R_RELAX_HINT:
+  continue;
+case R_TLSDESC_PC:
+  // For R_RISCV_TLSDESC_HI20, store &got(sym)-PC to be used by the
+  // following two instructions L[DW] and ADDI.
+  if (rel.type == R_RISCV_TLSDESC_HI20)
+tlsdescVal = val;
+  else
+val = tlsdescVal;
   break;
+case R_RELAX_TLS_GD_TO_IE:
+  // Only R_RISCV_TLSDESC_HI20 reaches here. tlsdescVal will be finalized
+  // after we see R_RISCV_TLSDESC_ADD_LO12 in the R_RELAX_TLS_GD_TO_LE 
case.
+  // The net effect is that tlsdescVal will be smaller than `val` to take
+  // into account of NOP instructions (in the absence of R_RISCV_RELAX)
+  // before AUIPC.
+  tlsdescVal = val + rel.offset;
+  isToLe = false;
+  if (!(i + 1 != relocs.size() && relocs[i + 1].type == R_RISCV_RELAX))
+tlsdescToIe(loc, rel, val);
+  continue;
+case R_RELAX_TLS_GD_TO_LE:
+  // See the comment in handleTlsRelocation. For TLSDESC=>IE,
+  // R_RISCV_TLSDESC_{LOAD_LO12,ADD_LO12,CALL} also reach here. If isToIe 
is
+  // true, this is actually TLSDESC=>IE optimization.
+  if (rel.type == R_RISCV_TLSDESC_HI20) {
+tlsdescVal = val;
+isToLe = true;
+  } else {
+if (!isToLe && rel.type == R_RISCV_TLSDESC_ADD_LO12)
+  tlsdescVal -= rel.offset;
+val = tlsdescVal;
+  }
+  // When NOP conversion is eligible and R_RISCV_RELAX is present, don't
+  // write a NOP in case an unrelated instruction follows the current
+  // instruction.
+  if ((rel.type == R_RISCV_TLSDESC_HI20 ||
+   rel.type == R_RISCV_TLSDESC_LOAD_LO12 ||
+   (rel.type == R_RISCV_TLSDESC_ADD_LO12 && isToLe && !hi20(val))) &&
+  i + 1 != relocs.size() && relocs[i + 1].type == R_RISCV_RELAX)

MaskRay wrote:

I am thinking of a simplification where I only check whether 
R_RISCV_TLSDESC_HI20 has an associated R_RISCV_RELAX. If yes, apply relaxation 
whether or not the following 3 instructions have an associated R_RISCV_RELAX.

Then, I just use another variable to hold "whether there is R_RISCV_RELAX", and 
arguably the straight-line code will be more readable than introducing a 
function call.
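
A minimal sketch of that idea with stand-in types (not the patch's code): one flag is set when the HI20 relocation is seen and reused for the rest of the sequence. `SketchRelType` and `markRelaxable` are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified relocation kinds for one TLSDESC code sequence.
enum SketchRelType { SK_HI20, SK_LOAD_LO12, SK_ADD_LO12, SK_CALL, SK_RELAX };

// For each relocation, records whether the TLSDESC sequence it belongs to is
// treated as relaxable. Only the HI20 relocation is inspected for a following
// relaxation marker; the other three reuse the remembered flag.
static std::vector<bool>
markRelaxable(const std::vector<SketchRelType> &Relocs) {
  std::vector<bool> Relaxable(Relocs.size(), false);
  bool TlsdescRelax = false;
  for (std::size_t I = 0; I != Relocs.size(); ++I) {
    if (Relocs[I] == SK_RELAX)
      continue; // The marker itself carries no instruction to rewrite.
    if (Relocs[I] == SK_HI20)
      TlsdescRelax = I + 1 != Relocs.size() && Relocs[I + 1] == SK_RELAX;
    Relaxable[I] = TlsdescRelax;
  }
  return Relaxable;
}
```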

On the LLVM side, relaxation can be enabled by adding R_RISCV_RELAX to just the 
first instruction, decreasing the size bloat (sizeof(Elf6

[Lldb-commits] [lldb] [lldb][NFCI] Remove unused method BreakpointIDList::FindBreakpointID(const char *, size_t *) (PR #79215)

2024-01-25 Thread Felipe de Azevedo Piovezan via lldb-commits

https://github.com/felipepiovezan approved this pull request.


https://github.com/llvm/llvm-project/pull/79215


[Lldb-commits] [clang] [lldb] [libunwind] [libcxx] [compiler-rt] [libc] [flang] [lld] [llvm] [clang-tools-extra] [mlir] [ELF] Implement R_RISCV_TLSDESC for RISC-V (PR #79239)

2024-01-25 Thread Paul Kirth via lldb-commits


@@ -513,29 +547,125 @@ void RISCV::relocate(uint8_t *loc, const Relocation 
&rel, uint64_t val) const {
 break;
 
   case R_RISCV_RELAX:
-return; // Ignored (for now)
-
+return;
+  case R_RISCV_TLSDESC:
+// The addend is stored in the second word.
+if (config->is64)
+  write64le(loc + 8, val);
+else
+  write32le(loc + 4, val);
+break;
   default:
 llvm_unreachable("unknown relocation");
   }
 }
 
+static void tlsdescToIe(uint8_t *loc, const Relocation &rel, uint64_t val) {
+  switch (rel.type) {
+  case R_RISCV_TLSDESC_HI20:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+write32le(loc, 0x0013); // nop
+break;
+  case R_RISCV_TLSDESC_ADD_LO12:
+write32le(loc, utype(AUIPC, X_A0, hi20(val))); // auipc a0,
+break;
+  case R_RISCV_TLSDESC_CALL:
+if (config->is64)
+  write32le(loc, itype(LD, X_A0, X_A0, lo12(val))); // ld a0,(a0)
+else
+  write32le(loc, itype(LW, X_A0, X_A0, lo12(val))); // lw a0,(a0)
+break;
+  default:
+llvm_unreachable("unsupported relocation for TLSDESC to IE");
+  }
+}
+
+static void tlsdescToLe(uint8_t *loc, const Relocation &rel, uint64_t val) {
+  switch (rel.type) {
+  case R_RISCV_TLSDESC_HI20:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+write32le(loc, 0x0013); // nop
+return;
+  case R_RISCV_TLSDESC_ADD_LO12:
+if (isInt<12>(val))
+  write32le(loc, 0x0013); // nop
+else
+  write32le(loc, utype(LUI, X_A0, hi20(val))); // lui a0,
+return;
+  case R_RISCV_TLSDESC_CALL:
+if (isInt<12>(val))
+  write32le(loc, itype(ADDI, X_A0, 0, val)); // addi a0,zero,
+else
+  write32le(loc, itype(ADDI, X_A0, X_A0, lo12(val))); // addi a0,a0,
+return;
+  default:
+llvm_unreachable("unsupported relocation for TLSDESC to LE");
+  }
+}
+
 void RISCV::relocateAlloc(InputSectionBase &sec, uint8_t *buf) const {
   uint64_t secAddr = sec.getOutputSection()->addr;
   if (auto *s = dyn_cast<InputSection>(&sec))
 secAddr += s->outSecOff;
   else if (auto *ehIn = dyn_cast<EhInputSection>(&sec))
 secAddr += ehIn->getParent()->outSecOff;
-  for (size_t i = 0, size = sec.relocs().size(); i != size; ++i) {
-const Relocation &rel = sec.relocs()[i];
+  uint64_t tlsdescVal = 0;
+  bool isToLe = false;
+  const ArrayRef<Relocation> relocs = sec.relocs();
+  for (size_t i = 0, size = relocs.size(); i != size; ++i) {
+const Relocation &rel = relocs[i];
 uint8_t *loc = buf + rel.offset;
-const uint64_t val =
+uint64_t val =
 sec.getRelocTargetVA(sec.file, rel.type, rel.addend,
  secAddr + rel.offset, *rel.sym, rel.expr);
 
 switch (rel.expr) {
 case R_RELAX_HINT:
+  continue;
+case R_TLSDESC_PC:
+  // For R_RISCV_TLSDESC_HI20, store &got(sym)-PC to be used by the
+  // following two instructions L[DW] and ADDI.
+  if (rel.type == R_RISCV_TLSDESC_HI20)
+tlsdescVal = val;
+  else
+val = tlsdescVal;
   break;
+case R_RELAX_TLS_GD_TO_IE:
+  // Only R_RISCV_TLSDESC_HI20 reaches here. tlsdescVal will be finalized
+  // after we see R_RISCV_TLSDESC_ADD_LO12 in the R_RELAX_TLS_GD_TO_LE case.
+  // The net effect is that tlsdescVal will be smaller than `val` to take
+  // into account the NOP instructions (in the absence of R_RISCV_RELAX)
+  // before AUIPC.
+  tlsdescVal = val + rel.offset;
+  isToLe = false;
+  if (!(i + 1 != relocs.size() && relocs[i + 1].type == R_RISCV_RELAX))
+tlsdescToIe(loc, rel, val);
+  continue;
+case R_RELAX_TLS_GD_TO_LE:
+  // See the comment in handleTlsRelocation. For TLSDESC=>IE,
+  // R_RISCV_TLSDESC_{LOAD_LO12,ADD_LO12,CALL} also reach here. If isToLe is
+  // false, this is actually TLSDESC=>IE optimization.
+  if (rel.type == R_RISCV_TLSDESC_HI20) {
+tlsdescVal = val;
+isToLe = true;
+  } else {
+if (!isToLe && rel.type == R_RISCV_TLSDESC_ADD_LO12)
+  tlsdescVal -= rel.offset;
+val = tlsdescVal;
+  }
+  // When NOP conversion is eligible and R_RISCV_RELAX is present, don't
+  // write a NOP in case an unrelated instruction follows the current
+  // instruction.
+  if ((rel.type == R_RISCV_TLSDESC_HI20 ||
+   rel.type == R_RISCV_TLSDESC_LOAD_LO12 ||
+   (rel.type == R_RISCV_TLSDESC_ADD_LO12 && isToLe && !hi20(val))) &&
+  i + 1 != relocs.size() && relocs[i + 1].type == R_RISCV_RELAX)

ilovepi wrote:

> I am thinking of a simplification where I only check whether 
> R_RISCV_TLSDESC_HI20 has an associated R_RISCV_RELAX. If yes, apply 
> relaxation whether or not the following 3 instructions have an associated 
> R_RISCV_RELAX.
> 
> Then, I just use another variable to hold "whether there is R_RISCV_RELAX" 
> and arguably the straight line code will be more readable than introducing a 
> function call.
> 

That sounds like a nice approach.
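
The simplification quoted above (decide once, at R_RISCV_TLSDESC_HI20, whether the sequence is relaxable, then reuse that decision for the following three relocations) could be sketched roughly as below. This is only an illustration under stated assumptions: `RelType`, `Reloc`, and `markRelaxable` are hypothetical stand-ins, not lld's real types or functions, and the real code would act on the flag rather than record it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for lld's relocation kinds (assumption, not lld code).
enum RelType {
  TLSDESC_HI20,
  TLSDESC_LOAD_LO12,
  TLSDESC_ADD_LO12,
  TLSDESC_CALL,
  RELAX,
  OTHER
};

struct Reloc {
  RelType type;
};

// For each relocation, report whether it belongs to a TLSDESC sequence whose
// HI20 relocation is immediately followed by RELAX.
std::vector<bool> markRelaxable(const std::vector<Reloc> &relocs) {
  std::vector<bool> relaxable(relocs.size(), false);
  bool seqHasRelax = false; // decided once per sequence, at TLSDESC_HI20
  for (std::size_t i = 0; i != relocs.size(); ++i) {
    switch (relocs[i].type) {
    case TLSDESC_HI20:
      seqHasRelax = i + 1 != relocs.size() && relocs[i + 1].type == RELAX;
      relaxable[i] = seqHasRelax;
      break;
    case TLSDESC_LOAD_LO12:
    case TLSDESC_ADD_LO12:
    case TLSDESC_CALL:
      // Reuse the decision made at HI20; do not look for another RELAX here.
      relaxable[i] = seqHasRelax;
      break;
    default:
      break;
    }
  }
  return relaxable;
}
```

Keeping the decision in one variable keeps the loop body straight-line, which is the readability argument made in the quoted comment.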

> On the LLVM side, relaxation can be enabled by adding R_RISCV_RELAX to just 
> th

[Lldb-commits] [llvm] [libc] [lld] [libcxxabi] [clang-tools-extra] [clang] [flang] [compiler-rt] [libclc] [libcxx] [lldb] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread Florian Hahn via lldb-commits

https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/78113

>From 36b085f21b76d7bf7c9965a86a09d1cef4fe9329 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Sun, 14 Jan 2024 14:13:08 +
Subject: [PATCH 1/7] [VPlan] Add new VPUniformPerUFRecipe, use for step
 truncation.

Add a new recipe to model uniform-per-UF instructions, without relying
on an underlying instruction. Initially, it supports uniform cast-ops
and is therefore storing the result type.

Not relying on an underlying instruction (like the current
VPReplicateRecipe) allows creating instances without a corresponding
instruction.

In the future, the plan is to extend this recipe to handle all opcodes
needed to replace the uniform part of VPReplicateRecipe.
---
 llvm/lib/Transforms/Vectorize/VPlan.h | 30 
 .../Transforms/Vectorize/VPlanAnalysis.cpp|  6 ++-
 .../lib/Transforms/Vectorize/VPlanRecipes.cpp | 49 ---
 .../Transforms/Vectorize/VPlanTransforms.cpp  |  9 
 llvm/lib/Transforms/Vectorize/VPlanValue.h|  1 +
 .../LoopVectorize/cast-induction.ll   |  4 +-
 .../interleave-and-scalarize-only.ll  |  3 +-
 .../pr46525-expander-insertpoint.ll   |  2 +-
 8 files changed, 93 insertions(+), 11 deletions(-)

diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h 
b/llvm/lib/Transforms/Vectorize/VPlan.h
index 4b4f4911eb6415..d598522448 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -1945,6 +1945,36 @@ class VPReplicateRecipe : public VPRecipeWithIRFlags, public VPValue {
   }
 };
 
+/// VPUniformPerUFRecipe represents an instruction with Opcode that is uniform
+/// per UF, i.e. it generates a single scalar instance per UF.
+/// TODO: at the moment, only Cast opcodes are supported, extend to support
+///   missing opcodes to replace uniform part of VPReplicateRecipe.
+class VPUniformPerUFRecipe : public VPRecipeBase, public VPValue {
+  unsigned Opcode;
+
+  /// Result type for the cast.
+  Type *ResultTy;
+
+  Value *generate(VPTransformState &State, unsigned Part);
+
+public:
+  VPUniformPerUFRecipe(Instruction::CastOps Opcode, VPValue *Op, Type *ResultTy)
+  : VPRecipeBase(VPDef::VPUniformPerUFSC, {Op}), VPValue(this),
+Opcode(Opcode), ResultTy(ResultTy) {}
+
+  ~VPUniformPerUFRecipe() override = default;
+
+  VP_CLASSOF_IMPL(VPDef::VPWidenIntOrFpInductionSC)
+
+  void execute(VPTransformState &State) override;
+
+#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
+  /// Print the recipe.
+  void print(raw_ostream &O, const Twine &Indent,
+ VPSlotTracker &SlotTracker) const override;
+#endif
+};
+
 /// A recipe for generating conditional branches on the bits of a mask.
 class VPBranchOnMaskRecipe : public VPRecipeBase {
 public:
diff --git a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp 
b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
index 97a8a1803bbf5a..d71b0703994450 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
@@ -230,7 +230,11 @@ Type *VPTypeAnalysis::inferScalarType(const VPValue *V) {
 return V->getUnderlyingValue()->getType();
   })
   .Case(
-  [](const VPWidenCastRecipe *R) { return R->getResultType(); });
+  [](const VPWidenCastRecipe *R) { return R->getResultType(); })
+  .Case([](const VPExpandSCEVRecipe *R) {
+return R->getSCEV()->getType();
+  });
+
   assert(ResultTy && "could not infer type for the given VPValue");
   CachedTypes[V] = ResultTy;
   return ResultTy;
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp 
b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 1f844bce23102e..423504e8f7e05e 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -164,6 +164,8 @@ bool VPRecipeBase::mayHaveSideEffects() const {
 auto *R = cast(this);
 return R->getUnderlyingInstr()->mayHaveSideEffects();
   }
+  case VPUniformPerUFSC:
+return false;
   default:
 return true;
   }
@@ -1117,13 +1119,7 @@ void VPScalarIVStepsRecipe::execute(VPTransformState 
&State) {
 
   // Ensure step has the same type as that of scalar IV.
   Type *BaseIVTy = BaseIV->getType()->getScalarType();
-  if (BaseIVTy != Step->getType()) {
-// TODO: Also use VPDerivedIVRecipe when only the step needs truncating, to
-// avoid separate truncate here.
-assert(Step->getType()->isIntegerTy() &&
-   "Truncation requires an integer step");
-Step = State.Builder.CreateTrunc(Step, BaseIVTy);
-  }
+  assert(BaseIVTy == Step->getType());
 
   // We build scalar steps for both integer and floating-point induction
   // variables. Here, we determine the kind of arithmetic we will perform.
@@ -1469,6 +1465,45 @@ void VPReplicateRecipe::print(raw_ostream &O, const 
Twine &Indent,
 }
 #endif
 
+Value *VPUniformPerUFRecipe ::generate(VPTransformState 

[Lldb-commits] [libcxxabi] [lld] [lldb] [flang] [libc] [llvm] [compiler-rt] [libclc] [clang] [libcxx] [clang-tools-extra] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread Florian Hahn via lldb-commits


@@ -491,19 +491,41 @@ void VPlanTransforms::removeDeadRecipes(VPlan &Plan) {
 
 static VPValue *createScalarIVSteps(VPlan &Plan, const InductionDescriptor &ID,
 ScalarEvolution &SE, Instruction *TruncI,
-Type *IVTy, VPValue *StartV,
-VPValue *Step) {
+VPValue *StartV, VPValue *Step) {
   VPBasicBlock *HeaderVPBB = Plan.getVectorLoopRegion()->getEntryBasicBlock();
   auto IP = HeaderVPBB->getFirstNonPhi();
   VPCanonicalIVPHIRecipe *CanonicalIV = Plan.getCanonicalIV();
-  Type *TruncTy = TruncI ? TruncI->getType() : IVTy;
   VPValue *BaseIV = CanonicalIV;
-  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step, TruncTy)) {
-BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step,
-   TruncI ? TruncI->getType() : nullptr);
+  VPTypeAnalysis TypeInfo(SE.getContext());
+  Type *StepTy = TypeInfo.inferScalarType(Step);
+  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step)) {
+BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step);
 HeaderVPBB->insert(BaseIV->getDefiningRecipe(), IP);
   }
 
+  // Truncate base induction if needed.
+  if (TruncI) {
+Type *TruncTy = TruncI->getType();
+assert(TypeInfo.inferScalarType(BaseIV)->getScalarSizeInBits() >
+   TruncTy->getScalarSizeInBits() &&
+   StepTy->isIntegerTy() && "Truncation requires an integer step");
+auto *T = new VPScalarCastRecipe(Instruction::Trunc, BaseIV, TruncTy);
+HeaderVPBB->insert(T, IP);
+BaseIV = T;
+  }
+
+  // Truncate step if needed.
+  Type *BaseIVTy = TypeInfo.inferScalarType(BaseIV);

fhahn wrote:

Moved, thanks!

https://github.com/llvm/llvm-project/pull/78113
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [lld] [libc] [lldb] [compiler-rt] [libcxx] [flang] [libclc] [libcxxabi] [clang-tools-extra] [clang] [llvm] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread Florian Hahn via lldb-commits

https://github.com/fhahn commented:

Address latest comments, thanks!

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [libcxxabi] [lldb] [clang-tools-extra] [libc] [flang] [llvm] [compiler-rt] [libclc] [libcxx] [lld] [clang] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread Florian Hahn via lldb-commits


@@ -491,19 +491,41 @@ void VPlanTransforms::removeDeadRecipes(VPlan &Plan) {
 
 static VPValue *createScalarIVSteps(VPlan &Plan, const InductionDescriptor &ID,
 ScalarEvolution &SE, Instruction *TruncI,
-Type *IVTy, VPValue *StartV,
-VPValue *Step) {
+VPValue *StartV, VPValue *Step) {
   VPBasicBlock *HeaderVPBB = Plan.getVectorLoopRegion()->getEntryBasicBlock();
   auto IP = HeaderVPBB->getFirstNonPhi();
   VPCanonicalIVPHIRecipe *CanonicalIV = Plan.getCanonicalIV();
-  Type *TruncTy = TruncI ? TruncI->getType() : IVTy;
   VPValue *BaseIV = CanonicalIV;
-  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step, TruncTy)) {
-BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step,
-   TruncI ? TruncI->getType() : nullptr);
+  VPTypeAnalysis TypeInfo(SE.getContext());
+  Type *StepTy = TypeInfo.inferScalarType(Step);
+  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step)) {
+BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step);
 HeaderVPBB->insert(BaseIV->getDefiningRecipe(), IP);
   }
 
+  // Truncate base induction if needed.
+  if (TruncI) {
+Type *TruncTy = TruncI->getType();
+assert(TypeInfo.inferScalarType(BaseIV)->getScalarSizeInBits() >
+   TruncTy->getScalarSizeInBits() &&
+   StepTy->isIntegerTy() && "Truncation requires an integer step");
+auto *T = new VPScalarCastRecipe(Instruction::Trunc, BaseIV, TruncTy);
+HeaderVPBB->insert(T, IP);
+BaseIV = T;
+  }
+
+  // Truncate step if needed.
+  Type *BaseIVTy = TypeInfo.inferScalarType(BaseIV);
+  if (BaseIVTy != StepTy) {
+assert(StepTy->getScalarSizeInBits() > BaseIVTy->getScalarSizeInBits() &&
+   "Not truncating.");
+
+Step = new VPScalarCastRecipe(Instruction::Trunc, Step, BaseIVTy);

fhahn wrote:

Renamed, thanks!

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [llvm] [clang-tools-extra] [libc] [lldb] [compiler-rt] [libclc] [libcxx] [libcxxabi] [clang] [flang] [lld] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread Florian Hahn via lldb-commits


@@ -491,19 +491,41 @@ void VPlanTransforms::removeDeadRecipes(VPlan &Plan) {
 
 static VPValue *createScalarIVSteps(VPlan &Plan, const InductionDescriptor &ID,
 ScalarEvolution &SE, Instruction *TruncI,
-Type *IVTy, VPValue *StartV,
-VPValue *Step) {
+VPValue *StartV, VPValue *Step) {
   VPBasicBlock *HeaderVPBB = Plan.getVectorLoopRegion()->getEntryBasicBlock();
   auto IP = HeaderVPBB->getFirstNonPhi();
   VPCanonicalIVPHIRecipe *CanonicalIV = Plan.getCanonicalIV();
-  Type *TruncTy = TruncI ? TruncI->getType() : IVTy;
   VPValue *BaseIV = CanonicalIV;
-  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step, TruncTy)) {
-BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step,
-   TruncI ? TruncI->getType() : nullptr);
+  VPTypeAnalysis TypeInfo(SE.getContext());
+  Type *StepTy = TypeInfo.inferScalarType(Step);

fhahn wrote:

sunk, thanks!

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [llvm] [libc] [lld] [libcxxabi] [clang-tools-extra] [clang] [flang] [compiler-rt] [libclc] [libcxx] [lldb] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread Florian Hahn via lldb-commits

https://github.com/fhahn edited https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [flang] [llvm] [clang-tools-extra] [libclc] [compiler-rt] [libcxxabi] [clang] [lldb] [lld] [libcxx] [libc] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread Florian Hahn via lldb-commits


@@ -859,6 +859,7 @@ class VPSingleDefRecipe : public VPRecipeBase, public 
VPValue {
 case VPRecipeBase::VPWidenIntOrFpInductionSC:
 case VPRecipeBase::VPWidenPointerInductionSC:
 case VPRecipeBase::VPReductionPHISC:
+case VPRecipeBase::VPScalarCastSC:

fhahn wrote:

Yep, will do separately.

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [libcxxabi] [lldb] [clang-tools-extra] [llvm] [libcxx] [libc] [libclc] [lld] [compiler-rt] [flang] [clang] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread Florian Hahn via lldb-commits


@@ -491,19 +491,41 @@ void VPlanTransforms::removeDeadRecipes(VPlan &Plan) {
 
 static VPValue *createScalarIVSteps(VPlan &Plan, const InductionDescriptor &ID,
 ScalarEvolution &SE, Instruction *TruncI,
-Type *IVTy, VPValue *StartV,
-VPValue *Step) {
+VPValue *StartV, VPValue *Step) {
   VPBasicBlock *HeaderVPBB = Plan.getVectorLoopRegion()->getEntryBasicBlock();
   auto IP = HeaderVPBB->getFirstNonPhi();
   VPCanonicalIVPHIRecipe *CanonicalIV = Plan.getCanonicalIV();
-  Type *TruncTy = TruncI ? TruncI->getType() : IVTy;
   VPValue *BaseIV = CanonicalIV;

fhahn wrote:

Done, thanks!

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [lldb] [clang-tools-extra] [compiler-rt] [llvm] [libcxxabi] [libclc] [libc] [lld] [clang] [libcxx] [flang] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread Florian Hahn via lldb-commits


@@ -1469,6 +1461,52 @@ void VPReplicateRecipe::print(raw_ostream &O, const 
Twine &Indent,
 }
 #endif
 
+static bool isUniformAcrossVFsAndUFs(VPScalarCastRecipe *C) {
+  return C->isDefinedOutsideVectorRegions() ||
+ isa(C->getOperand(0)) ||
+ isa(C->getOperand(0));
+}
+
+Value *VPScalarCastRecipe ::generate(VPTransformState &State, unsigned Part) {
+  assert(vputils::onlyFirstLaneUsed(this) &&
+ "Codegen only implemented for first lane.");
+  switch (Opcode) {
+  case Instruction::SExt:
+  case Instruction::ZExt:

fhahn wrote:

Added a note, thanks!

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [flang] [llvm] [compiler-rt] [libcxx] [libclc] [libcxxabi] [lldb] [clang-tools-extra] [libc] [lld] [clang] [VPlan] Add new VPScalarCastRecipe, use for IV & step trunc. (PR #78113)

2024-01-25 Thread Florian Hahn via lldb-commits


@@ -491,19 +491,41 @@ void VPlanTransforms::removeDeadRecipes(VPlan &Plan) {
 
 static VPValue *createScalarIVSteps(VPlan &Plan, const InductionDescriptor &ID,
 ScalarEvolution &SE, Instruction *TruncI,
-Type *IVTy, VPValue *StartV,
-VPValue *Step) {
+VPValue *StartV, VPValue *Step) {
   VPBasicBlock *HeaderVPBB = Plan.getVectorLoopRegion()->getEntryBasicBlock();
   auto IP = HeaderVPBB->getFirstNonPhi();
   VPCanonicalIVPHIRecipe *CanonicalIV = Plan.getCanonicalIV();
-  Type *TruncTy = TruncI ? TruncI->getType() : IVTy;
   VPValue *BaseIV = CanonicalIV;
-  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step, TruncTy)) {
-BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step,
-   TruncI ? TruncI->getType() : nullptr);
+  VPTypeAnalysis TypeInfo(SE.getContext());
+  Type *StepTy = TypeInfo.inferScalarType(Step);
+  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step)) {
+BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step);
 HeaderVPBB->insert(BaseIV->getDefiningRecipe(), IP);
   }
 
+  // Truncate base induction if needed.
+  if (TruncI) {
+Type *TruncTy = TruncI->getType();
+assert(TypeInfo.inferScalarType(BaseIV)->getScalarSizeInBits() >
+   TruncTy->getScalarSizeInBits() &&
+   StepTy->isIntegerTy() && "Truncation requires an integer step");
+auto *T = new VPScalarCastRecipe(Instruction::Trunc, BaseIV, TruncTy);
+HeaderVPBB->insert(T, IP);
+BaseIV = T;
+  }
+
+  // Truncate step if needed.
+  Type *BaseIVTy = TypeInfo.inferScalarType(BaseIV);
+  if (BaseIVTy != StepTy) {
+assert(StepTy->getScalarSizeInBits() > BaseIVTy->getScalarSizeInBits() &&
+   "Not truncating.");

fhahn wrote:

Renamed thanks!

https://github.com/llvm/llvm-project/pull/78113


[Lldb-commits] [lldb] [lldb][NFCI] Remove unused method BreakpointIDList::FindBreakpointID(const char *, size_t *) (PR #79215)

2024-01-25 Thread via lldb-commits

jimingham wrote:

LGTM

https://github.com/llvm/llvm-project/pull/79215


[Lldb-commits] [libc] [llvm] [compiler-rt] [libcxx] [lld] [lldb] [flang] [clang-tools-extra] [libcxxabi] [clang] [NVPTX] Improve lowering of v4i8 (PR #67866)

2024-01-25 Thread Justin Fargnoli via lldb-commits

https://github.com/justinfargnoli edited 
https://github.com/llvm/llvm-project/pull/67866


[Lldb-commits] [lldb] 89dc706 - [lldb] Fix printf format errors

2024-01-25 Thread Kazu Hirata via lldb-commits

Author: Kazu Hirata
Date: 2024-01-25T10:39:24-08:00
New Revision: 89dc7063f6c81d468a61b71b4ca612e22cb87a46

URL: 
https://github.com/llvm/llvm-project/commit/89dc7063f6c81d468a61b71b4ca612e22cb87a46
DIFF: 
https://github.com/llvm/llvm-project/commit/89dc7063f6c81d468a61b71b4ca612e22cb87a46.diff

LOG: [lldb] Fix printf format errors

This patch fixes:

  lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp:1108:39: error:
  format specifies type 'long long' but the argument has type
  'std::time_t' (aka 'long') [-Werror,-Wformat]

  lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp:1116:64: error:
  format specifies type 'long long' but the argument has type
  'std::time_t' (aka 'long') [-Werror,-Wformat]
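
The errors above arise because `%lld` expects `long long` while `std::time_t` is `long` on the reporting platform. A minimal sketch of the usual portable pattern is to cast to a fixed-width integer and use the matching `<cinttypes>` macro. The helper name `formatTimestamp` is hypothetical, and the signed `PRId64`/`int64_t` pairing is an assumption chosen for this sketch; the diff in this commit pairs its cast with `PRIu64`.

```cpp
#include <cinttypes>
#include <cstdio>
#include <ctime>
#include <string>

// Hypothetical helper (not part of lldb): format a time_t without depending
// on whether the platform defines it as long or long long.
std::string formatTimestamp(std::time_t seconds) {
  char buf[64];
  // The cast fixes the argument width; PRId64 supplies the matching
  // conversion specifier for int64_t on every platform.
  std::snprintf(buf, sizeof(buf), "timestamp=%" PRId64 " s",
                static_cast<std::int64_t>(seconds));
  return std::string(buf);
}
```

The format string and the argument type are tied to the same fixed-width type, so `-Wformat` has nothing platform-dependent to complain about.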

Added: 


Modified: 
lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp

Removed: 




diff  --git a/lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp 
b/lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp
index c5bed2cee815078..d0bdbe1fd4d91ac 100644
--- a/lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp
+++ b/lldb/source/Plugins/Language/CPlusPlus/LibCxx.cpp
@@ -1105,7 +1105,7 @@ bool 
lldb_private::formatters::LibcxxChronoSysSecondsSummaryProvider(
 
   const std::time_t seconds = ptr_sp->GetValueAsSigned(0);
   if (seconds < chrono_timestamp_min || seconds > chrono_timestamp_max)
-stream.Printf("timestamp=%lld s", seconds);
+stream.Printf("timestamp=%" PRIu64 " s", static_cast<uint64_t>(seconds));
   else {
 std::array str;
 std::size_t size =
@@ -1113,7 +1113,8 @@ bool 
lldb_private::formatters::LibcxxChronoSysSecondsSummaryProvider(
 if (size == 0)
   return false;
 
-stream.Printf("date/time=%s timestamp=%lld s", str.data(), seconds);
+stream.Printf("date/time=%s timestamp=%" PRIu64 " s", str.data(),
+  static_cast<uint64_t>(seconds));
   }
 
   return true;





[Lldb-commits] [lld] [clang] [clang-tools-extra] [libunwind] [compiler-rt] [libc] [lldb] [llvm] [libcxx] [mlir] [flang] [ELF] Implement R_RISCV_TLSDESC for RISC-V (PR #79239)

2024-01-25 Thread Fangrui Song via lldb-commits

https://github.com/MaskRay edited 
https://github.com/llvm/llvm-project/pull/79239


[Lldb-commits] [compiler-rt] [flang] [libcxx] [libunwind] [clang] [llvm] [clang-tools-extra] [lldb] [mlir] [libc] [lld] [ELF] Implement R_RISCV_TLSDESC for RISC-V (PR #79239)

2024-01-25 Thread Fangrui Song via lldb-commits

https://github.com/MaskRay updated 
https://github.com/llvm/llvm-project/pull/79239

>From 3725fa4eac3d3d946289d7eb7213f3a1751a2770 Mon Sep 17 00:00:00 2001
From: Fangrui Song 
Date: Tue, 23 Jan 2024 17:58:07 -0800
Subject: [PATCH] [𝘀𝗽𝗿] initial version
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.4
---
 lld/ELF/Arch/RISCV.cpp| 158 +
 lld/ELF/Relocations.cpp   |  25 ++--
 lld/test/ELF/riscv-tlsdesc-gd-mixed.s |  26 
 lld/test/ELF/riscv-tlsdesc-relax.s| 125 +
 lld/test/ELF/riscv-tlsdesc.s  | 192 ++
 5 files changed, 492 insertions(+), 34 deletions(-)
 create mode 100644 lld/test/ELF/riscv-tlsdesc-gd-mixed.s
 create mode 100644 lld/test/ELF/riscv-tlsdesc-relax.s
 create mode 100644 lld/test/ELF/riscv-tlsdesc.s

diff --git a/lld/ELF/Arch/RISCV.cpp b/lld/ELF/Arch/RISCV.cpp
index d7d3d3e47814971..67d7e2562e9b178 100644
--- a/lld/ELF/Arch/RISCV.cpp
+++ b/lld/ELF/Arch/RISCV.cpp
@@ -61,6 +61,7 @@ enum Op {
   AUIPC = 0x17,
   JALR = 0x67,
   LD = 0x3003,
+  LUI = 0x37,
   LW = 0x2003,
   SRLI = 0x5013,
   SUB = 0x4033,
@@ -73,6 +74,7 @@ enum Reg {
   X_T0 = 5,
   X_T1 = 6,
   X_T2 = 7,
+  X_A0 = 10,
   X_T3 = 28,
 };
 
@@ -102,6 +104,26 @@ static uint32_t setLO12_S(uint32_t insn, uint32_t imm) {
  (extractBits(imm, 4, 0) << 7);
 }
 
+namespace {
+struct SymbolAnchor {
+  uint64_t offset;
+  Defined *d;
+  bool end; // true for the anchor of st_value+st_size
+};
+} // namespace
+
+struct elf::RISCVRelaxAux {
+  // This records symbol start and end offsets which will be adjusted according
+  // to the nearest relocDeltas element.
+  SmallVector anchors;
+  // For relocations[i], the actual offset is r_offset - (i ? relocDeltas[i-1] :
+  // 0).
+  std::unique_ptr relocDeltas;
+  // For relocations[i], the actual type is relocTypes[i].
+  std::unique_ptr relocTypes;
+  SmallVector writes;
+};
+
 RISCV::RISCV() {
   copyRel = R_RISCV_COPY;
   pltRel = R_RISCV_JUMP_SLOT;
@@ -119,6 +141,7 @@ RISCV::RISCV() {
 tlsGotRel = R_RISCV_TLS_TPREL32;
   }
   gotRel = symbolicRel;
+  tlsDescRel = R_RISCV_TLSDESC;
 
   // .got[0] = _DYNAMIC
   gotHeaderEntriesNum = 1;
@@ -187,6 +210,8 @@ int64_t RISCV::getImplicitAddend(const uint8_t *buf, 
RelType type) const {
   case R_RISCV_JUMP_SLOT:
 // These relocations are defined as not having an implicit addend.
 return 0;
+  case R_RISCV_TLSDESC:
+return config->is64 ? read64le(buf + 8) : read32le(buf + 4);
   }
 }
 
@@ -295,6 +320,12 @@ RelExpr RISCV::getRelExpr(const RelType type, const Symbol 
&s,
   case R_RISCV_PCREL_LO12_I:
   case R_RISCV_PCREL_LO12_S:
 return R_RISCV_PC_INDIRECT;
+  case R_RISCV_TLSDESC_HI20:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+  case R_RISCV_TLSDESC_ADD_LO12:
+return R_TLSDESC_PC;
+  case R_RISCV_TLSDESC_CALL:
+return R_TLSDESC_CALL;
   case R_RISCV_TLS_GD_HI20:
 return R_TLSGD_PC;
   case R_RISCV_TLS_GOT_HI20:
@@ -419,6 +450,7 @@ void RISCV::relocate(uint8_t *loc, const Relocation &rel, 
uint64_t val) const {
 
   case R_RISCV_GOT_HI20:
   case R_RISCV_PCREL_HI20:
+  case R_RISCV_TLSDESC_HI20:
   case R_RISCV_TLS_GD_HI20:
   case R_RISCV_TLS_GOT_HI20:
   case R_RISCV_TPREL_HI20:
@@ -430,6 +462,8 @@ void RISCV::relocate(uint8_t *loc, const Relocation &rel, 
uint64_t val) const {
   }
 
   case R_RISCV_PCREL_LO12_I:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+  case R_RISCV_TLSDESC_ADD_LO12:
   case R_RISCV_TPREL_LO12_I:
   case R_RISCV_LO12_I: {
 uint64_t hi = (val + 0x800) >> 12;
@@ -513,29 +547,113 @@ void RISCV::relocate(uint8_t *loc, const Relocation 
&rel, uint64_t val) const {
 break;
 
   case R_RISCV_RELAX:
-return; // Ignored (for now)
-
+return;
+  case R_RISCV_TLSDESC:
+// The addend is stored in the second word.
+if (config->is64)
+  write64le(loc + 8, val);
+else
+  write32le(loc + 4, val);
+break;
   default:
 llvm_unreachable("unknown relocation");
   }
 }
 
+static void tlsdescToIe(uint8_t *loc, const Relocation &rel, uint64_t val) {
+  switch (rel.type) {
+  case R_RISCV_TLSDESC_HI20:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+write32le(loc, 0x0013); // nop
+return;
+  case R_RISCV_TLSDESC_ADD_LO12:
+write32le(loc, utype(AUIPC, X_A0, hi20(val))); // auipc a0,
+return;
+  case R_RISCV_TLSDESC_CALL:
+if (config->is64)
+  write32le(loc, itype(LD, X_A0, X_A0, lo12(val))); // ld a0,(a0)
+else
+  write32le(loc, itype(LW, X_A0, X_A0, lo12(val))); // lw a0,(a0)
+return;
+  default:
+llvm_unreachable("unsupported relocation for TLSDESC to IE relaxation");
+  }
+}
+
+static void tlsdescToLe(uint8_t *loc, const Relocation &rel, uint64_t val) {
+  switch (rel.type) {
+  case R_RISCV_TLSDESC_HI20:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+write32le(loc, 0x0013); // nop
+return;
+  case R_RISCV_TLSDESC_ADD

[Lldb-commits] [compiler-rt] [flang] [libcxx] [libunwind] [clang] [llvm] [clang-tools-extra] [lldb] [mlir] [libc] [lld] [ELF] Implement R_RISCV_TLSDESC for RISC-V (PR #79239)

2024-01-25 Thread Fangrui Song via lldb-commits

https://github.com/MaskRay updated 
https://github.com/llvm/llvm-project/pull/79239

>From 3725fa4eac3d3d946289d7eb7213f3a1751a2770 Mon Sep 17 00:00:00 2001
From: Fangrui Song 
Date: Tue, 23 Jan 2024 17:58:07 -0800
Subject: [PATCH 1/2] [𝘀𝗽𝗿] initial version
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.4
---
 lld/ELF/Arch/RISCV.cpp| 158 +
 lld/ELF/Relocations.cpp   |  25 ++--
 lld/test/ELF/riscv-tlsdesc-gd-mixed.s |  26 
 lld/test/ELF/riscv-tlsdesc-relax.s| 125 +
 lld/test/ELF/riscv-tlsdesc.s  | 192 ++
 5 files changed, 492 insertions(+), 34 deletions(-)
 create mode 100644 lld/test/ELF/riscv-tlsdesc-gd-mixed.s
 create mode 100644 lld/test/ELF/riscv-tlsdesc-relax.s
 create mode 100644 lld/test/ELF/riscv-tlsdesc.s

diff --git a/lld/ELF/Arch/RISCV.cpp b/lld/ELF/Arch/RISCV.cpp
index d7d3d3e47814971..67d7e2562e9b178 100644
--- a/lld/ELF/Arch/RISCV.cpp
+++ b/lld/ELF/Arch/RISCV.cpp
@@ -61,6 +61,7 @@ enum Op {
   AUIPC = 0x17,
   JALR = 0x67,
   LD = 0x3003,
+  LUI = 0x37,
   LW = 0x2003,
   SRLI = 0x5013,
   SUB = 0x4033,
@@ -73,6 +74,7 @@ enum Reg {
   X_T0 = 5,
   X_T1 = 6,
   X_T2 = 7,
+  X_A0 = 10,
   X_T3 = 28,
 };
 
@@ -102,6 +104,26 @@ static uint32_t setLO12_S(uint32_t insn, uint32_t imm) {
  (extractBits(imm, 4, 0) << 7);
 }
 
+namespace {
+struct SymbolAnchor {
+  uint64_t offset;
+  Defined *d;
+  bool end; // true for the anchor of st_value+st_size
+};
+} // namespace
+
+struct elf::RISCVRelaxAux {
+  // This records symbol start and end offsets which will be adjusted according
+  // to the nearest relocDeltas element.
+  SmallVector anchors;
+  // For relocations[i], the actual offset is r_offset - (i ? relocDeltas[i-1] :
+  // 0).
+  std::unique_ptr relocDeltas;
+  // For relocations[i], the actual type is relocTypes[i].
+  std::unique_ptr relocTypes;
+  SmallVector writes;
+};
+
 RISCV::RISCV() {
   copyRel = R_RISCV_COPY;
   pltRel = R_RISCV_JUMP_SLOT;
@@ -119,6 +141,7 @@ RISCV::RISCV() {
 tlsGotRel = R_RISCV_TLS_TPREL32;
   }
   gotRel = symbolicRel;
+  tlsDescRel = R_RISCV_TLSDESC;
 
   // .got[0] = _DYNAMIC
   gotHeaderEntriesNum = 1;
@@ -187,6 +210,8 @@ int64_t RISCV::getImplicitAddend(const uint8_t *buf, 
RelType type) const {
   case R_RISCV_JUMP_SLOT:
 // These relocations are defined as not having an implicit addend.
 return 0;
+  case R_RISCV_TLSDESC:
+return config->is64 ? read64le(buf + 8) : read32le(buf + 4);
   }
 }
 
@@ -295,6 +320,12 @@ RelExpr RISCV::getRelExpr(const RelType type, const Symbol 
&s,
   case R_RISCV_PCREL_LO12_I:
   case R_RISCV_PCREL_LO12_S:
 return R_RISCV_PC_INDIRECT;
+  case R_RISCV_TLSDESC_HI20:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+  case R_RISCV_TLSDESC_ADD_LO12:
+return R_TLSDESC_PC;
+  case R_RISCV_TLSDESC_CALL:
+return R_TLSDESC_CALL;
   case R_RISCV_TLS_GD_HI20:
 return R_TLSGD_PC;
   case R_RISCV_TLS_GOT_HI20:
@@ -419,6 +450,7 @@ void RISCV::relocate(uint8_t *loc, const Relocation &rel, 
uint64_t val) const {
 
   case R_RISCV_GOT_HI20:
   case R_RISCV_PCREL_HI20:
+  case R_RISCV_TLSDESC_HI20:
   case R_RISCV_TLS_GD_HI20:
   case R_RISCV_TLS_GOT_HI20:
   case R_RISCV_TPREL_HI20:
@@ -430,6 +462,8 @@ void RISCV::relocate(uint8_t *loc, const Relocation &rel, 
uint64_t val) const {
   }
 
   case R_RISCV_PCREL_LO12_I:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+  case R_RISCV_TLSDESC_ADD_LO12:
   case R_RISCV_TPREL_LO12_I:
   case R_RISCV_LO12_I: {
 uint64_t hi = (val + 0x800) >> 12;
@@ -513,29 +547,113 @@ void RISCV::relocate(uint8_t *loc, const Relocation 
&rel, uint64_t val) const {
 break;
 
   case R_RISCV_RELAX:
-return; // Ignored (for now)
-
+return;
+  case R_RISCV_TLSDESC:
+// The addend is stored in the second word.
+if (config->is64)
+  write64le(loc + 8, val);
+else
+  write32le(loc + 4, val);
+break;
   default:
 llvm_unreachable("unknown relocation");
   }
 }
 
+static void tlsdescToIe(uint8_t *loc, const Relocation &rel, uint64_t val) {
+  switch (rel.type) {
+  case R_RISCV_TLSDESC_HI20:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+write32le(loc, 0x0013); // nop
+return;
+  case R_RISCV_TLSDESC_ADD_LO12:
+write32le(loc, utype(AUIPC, X_A0, hi20(val))); // auipc a0,
+return;
+  case R_RISCV_TLSDESC_CALL:
+if (config->is64)
+  write32le(loc, itype(LD, X_A0, X_A0, lo12(val))); // ld a0,(a0)
+else
+  write32le(loc, itype(LW, X_A0, X_A0, lo12(val))); // lw a0,(a0)
+return;
+  default:
+llvm_unreachable("unsupported relocation for TLSDESC to IE relaxation");
+  }
+}
+
+static void tlsdescToLe(uint8_t *loc, const Relocation &rel, uint64_t val) {
+  switch (rel.type) {
+  case R_RISCV_TLSDESC_HI20:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+write32le(loc, 0x0013); // nop
+return;
+  case R_RISCV_TLSDESC

[Lldb-commits] [compiler-rt] [flang] [libcxx] [clang] [llvm] [clang-tools-extra] [lldb] [lld] [libc] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via lldb-commits

https://github.com/jhuber6 updated 
https://github.com/llvm/llvm-project/pull/79373

>From 145b7bc932ce3ffa46545cd7af29b1c93981429c Mon Sep 17 00:00:00 2001
From: Joseph Huber 
Date: Wed, 24 Jan 2024 15:34:00 -0600
Subject: [PATCH 1/3] [NVPTX] Add support for -march=native in standalone NVPTX

Summary:
We support `--target=nvptx64-nvidia-cuda` as a way to target the NVPTX
architecture from standard CPU. This patch simply uses the existing
support for handling `--offload-arch=native` to also apply to the
standalone toolchain.
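
As a rough illustration of the behaviour summarized above, the sketch below resolves `native` by taking the first non-empty line of a detection tool's output and reporting failure when nothing is listed. It is a standalone approximation using only the standard library; `parseArchs` and `resolveNativeArch` are hypothetical names, and the one-architecture-per-line output format is an assumption based on the patch's parsing loop.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split tool output on newlines and drop empty lines.
std::vector<std::string> parseArchs(const std::string &toolOutput) {
  std::vector<std::string> archs;
  std::istringstream in(toolOutput);
  std::string line;
  while (std::getline(in, line))
    if (!line.empty())
      archs.push_back(line);
  return archs;
}

// Hypothetical helper: pick the first detected architecture for "native".
// Returns false when no GPU was reported, leaving `arch` untouched.
bool resolveNativeArch(const std::string &toolOutput, std::string &arch) {
  std::vector<std::string> archs = parseArchs(toolOutput);
  if (archs.empty())
    return false;
  arch = archs.front();
  return true;
}
```

Choosing the first entry mirrors the `GPUsOrErr->front()` call in the patch; a system with several different GPUs would need a policy decision that this sketch does not attempt.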
---
 clang/lib/Driver/ToolChains/Cuda.cpp   | 61 +-
 clang/lib/Driver/ToolChains/Cuda.h | 10 ++--
 clang/test/Driver/nvptx-cuda-system-arch.c |  5 ++
 3 files changed, 45 insertions(+), 31 deletions(-)

diff --git a/clang/lib/Driver/ToolChains/Cuda.cpp 
b/clang/lib/Driver/ToolChains/Cuda.cpp
index 1462576ca870e6f..6215c43b5fc96bd 100644
--- a/clang/lib/Driver/ToolChains/Cuda.cpp
+++ b/clang/lib/Driver/ToolChains/Cuda.cpp
@@ -738,9 +738,18 @@ NVPTXToolChain::TranslateArgs(const 
llvm::opt::DerivedArgList &Args,
 if (!llvm::is_contained(*DAL, A))
   DAL->append(A);
 
-  if (!DAL->hasArg(options::OPT_march_EQ))
+  if (!DAL->hasArg(options::OPT_march_EQ)) {
 DAL->AddJoinedArg(nullptr, Opts.getOption(options::OPT_march_EQ),
   CudaArchToString(CudaArch::CudaDefault));
+  } else if (DAL->getLastArgValue(options::OPT_march_EQ) == "native") {
+auto GPUsOrErr = getSystemGPUArchs(Args);
+if (!GPUsOrErr)
+  getDriver().Diag(diag::err_drv_undetermined_gpu_arch)
+  << getArchName() << llvm::toString(GPUsOrErr.takeError()) << 
"-march";
+else
+  DAL->AddJoinedArg(nullptr, Opts.getOption(options::OPT_march_EQ),
+Args.MakeArgString(GPUsOrErr->front()));
+  }
 
   return DAL;
 }
@@ -783,6 +792,31 @@ void NVPTXToolChain::adjustDebugInfoKind(
   }
 }
 
+Expected<SmallVector<std::string>>
+NVPTXToolChain::getSystemGPUArchs(const ArgList &Args) const {
+  // Detect NVIDIA GPUs availible on the system.
+  std::string Program;
+  if (Arg *A = Args.getLastArg(options::OPT_nvptx_arch_tool_EQ))
+Program = A->getValue();
+  else
+Program = GetProgramPath("nvptx-arch");
+
+  auto StdoutOrErr = executeToolChainProgram(Program);
+  if (!StdoutOrErr)
+return StdoutOrErr.takeError();
+
+  SmallVector GPUArchs;
+  for (StringRef Arch : llvm::split((*StdoutOrErr)->getBuffer(), "\n"))
+if (!Arch.empty())
+  GPUArchs.push_back(Arch.str());
+
+  if (GPUArchs.empty())
+return llvm::createStringError(std::error_code(),
+   "No NVIDIA GPU detected in the system");
+
+  return std::move(GPUArchs);
+}
+
 /// CUDA toolchain.  Our assembler is ptxas, and our "linker" is fatbinary,
 /// which isn't properly a linker but nonetheless performs the step of 
stitching
 /// together object files from the assembler into a single blob.
@@ -948,31 +982,6 @@ CudaToolChain::TranslateArgs(const 
llvm::opt::DerivedArgList &Args,
   return DAL;
 }
 
-Expected<SmallVector<std::string>>
-CudaToolChain::getSystemGPUArchs(const ArgList &Args) const {
-  // Detect NVIDIA GPUs availible on the system.
-  std::string Program;
-  if (Arg *A = Args.getLastArg(options::OPT_nvptx_arch_tool_EQ))
-Program = A->getValue();
-  else
-Program = GetProgramPath("nvptx-arch");
-
-  auto StdoutOrErr = executeToolChainProgram(Program);
-  if (!StdoutOrErr)
-return StdoutOrErr.takeError();
-
-  SmallVector GPUArchs;
-  for (StringRef Arch : llvm::split((*StdoutOrErr)->getBuffer(), "\n"))
-if (!Arch.empty())
-  GPUArchs.push_back(Arch.str());
-
-  if (GPUArchs.empty())
-return llvm::createStringError(std::error_code(),
-   "No NVIDIA GPU detected in the system");
-
-  return std::move(GPUArchs);
-}
-
 Tool *NVPTXToolChain::buildAssembler() const {
   return new tools::NVPTX::Assembler(*this);
 }
diff --git a/clang/lib/Driver/ToolChains/Cuda.h 
b/clang/lib/Driver/ToolChains/Cuda.h
index 8a053f3393e1206..43c17ba7c0ba03d 100644
--- a/clang/lib/Driver/ToolChains/Cuda.h
+++ b/clang/lib/Driver/ToolChains/Cuda.h
@@ -168,6 +168,11 @@ class LLVM_LIBRARY_VISIBILITY NVPTXToolChain : public 
ToolChain {
   unsigned GetDefaultDwarfVersion() const override { return 2; }
   unsigned getMaxDwarfVersion() const override { return 2; }
 
+  /// Uses nvptx-arch tool to get arch of the system GPU. Will return error
+  /// if unable to find one.
+  virtual Expected<SmallVector<std::string>>
+  getSystemGPUArchs(const llvm::opt::ArgList &Args) const override;
+
   CudaInstallationDetector CudaInstallation;
 
 protected:
@@ -223,11 +228,6 @@ class LLVM_LIBRARY_VISIBILITY CudaToolChain : public 
NVPTXToolChain {
 
   const ToolChain &HostTC;
 
-  /// Uses nvptx-arch tool to get arch of the system GPU. Will return error
-  /// if unable to find one.
-  virtual Expected<SmallVector<std::string>>
-  getSystemGPUArchs(const llvm::opt::ArgList &Args) const override;
-
 protected:
   Tool *buildAssembler() const override; // ptxas
   Tool *buildLinker() const override;  

[Lldb-commits] [lld] [lldb] [llvm] [compiler-rt] [clang-tools-extra] [libc] [clang] [flang] [libcxx] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via lldb-commits

jhuber6 wrote:

> On the other hand, I'd be OK with providing --offload-arch=native translating 
> into "compile for all present GPU variants", with a possibility to further 
> adjust the selected set with the usual --no-offload-arch-foo, if the user 
> needs to. This will at least produce code that will run on the machine where 
> it's built, be somewhat consistent and is still adjustable by the user when 
> the default choice will inevitably be wrong.

This is what we already do, but this is somewhat tangential. I've updated this 
patch to present the warning in the case of multiple GPUs being detected, so I 
don't think there's a concern here with the user being confused. If they have 
two GPUs, the warning will tell them which one it's using with the correct 
`sm_` value to specify it manually if they so wish. If there is only one GPU on 
the system, it should be obvious that it's going to be targeted.
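
The behaviour described above (error with no GPU, first GPU plus a warning when several are reported) can be sketched in plain C++. This is only an illustration under stated assumptions: `parseArchList`, `NativeArchChoice` and `chooseNativeArch` are hypothetical names, not part of the patch, and the input is assumed to be one architecture name per line, as the loop in `getSystemGPUArchs` suggests.

```cpp
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split tool output (assumed one arch per line) into
// names, skipping blank lines.
std::vector<std::string> parseArchList(const std::string &toolOutput) {
  std::vector<std::string> archs;
  std::istringstream in(toolOutput);
  std::string line;
  while (std::getline(in, line))
    if (!line.empty())
      archs.push_back(line);
  return archs;
}

// Hypothetical result type for the sketch.
struct NativeArchChoice {
  std::string arch; // the architecture that would be targeted
  bool ambiguous;   // true when more than one entry was reported
};

// Hypothetical policy: no entries means "error" (nullopt); otherwise take the
// first entry and flag the choice so the caller can emit a warning.
std::optional<NativeArchChoice>
chooseNativeArch(const std::string &toolOutput) {
  std::vector<std::string> archs = parseArchList(toolOutput);
  if (archs.empty())
    return std::nullopt;
  return NativeArchChoice{archs.front(), archs.size() > 1};
}
```

A caller would report an error on `std::nullopt` and print a warning naming `arch` when `ambiguous` is set, so the user knows which `sm_` value to pass explicitly.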

https://github.com/llvm/llvm-project/pull/79373
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [lldb] 59a6525 - [lldb][NFCI] Remove unused method BreakpointIDList::FindBreakpointID(const char *, size_t *) (#79215)

2024-01-25 Thread via lldb-commits

Author: Alex Langford
Date: 2024-01-25T11:14:53-08:00
New Revision: 59a6525a4b9d46b931021f727b3235415bc82ea5

URL: 
https://github.com/llvm/llvm-project/commit/59a6525a4b9d46b931021f727b3235415bc82ea5
DIFF: 
https://github.com/llvm/llvm-project/commit/59a6525a4b9d46b931021f727b3235415bc82ea5.diff

LOG: [lldb][NFCI] Remove unused method BreakpointIDList::FindBreakpointID(const 
char *, size_t *) (#79215)

Added: 


Modified: 
lldb/include/lldb/Breakpoint/BreakpointIDList.h
lldb/source/Breakpoint/BreakpointIDList.cpp

Removed: 




diff  --git a/lldb/include/lldb/Breakpoint/BreakpointIDList.h 
b/lldb/include/lldb/Breakpoint/BreakpointIDList.h
index 6c57d9bc507952..ddf85dd78cf2e0 100644
--- a/lldb/include/lldb/Breakpoint/BreakpointIDList.h
+++ b/lldb/include/lldb/Breakpoint/BreakpointIDList.h
@@ -45,8 +45,6 @@ class BreakpointIDList {
   // TODO: This should take a const BreakpointID.
   bool FindBreakpointID(BreakpointID &bp_id, size_t *position) const;
 
-  bool FindBreakpointID(const char *bp_id, size_t *position) const;
-
   // Returns a pair consisting of the beginning and end of a breakpoint
   // ID range expression.  If the input string is not a valid specification,
   // returns an empty pair.

diff  --git a/lldb/source/Breakpoint/BreakpointIDList.cpp 
b/lldb/source/Breakpoint/BreakpointIDList.cpp
index 5ab2c9a8dc3865..5904647314bc0c 100644
--- a/lldb/source/Breakpoint/BreakpointIDList.cpp
+++ b/lldb/source/Breakpoint/BreakpointIDList.cpp
@@ -62,15 +62,6 @@ bool BreakpointIDList::FindBreakpointID(BreakpointID &bp_id,
   return false;
 }
 
-bool BreakpointIDList::FindBreakpointID(const char *bp_id_str,
-size_t *position) const {
-  auto bp_id = BreakpointID::ParseCanonicalReference(bp_id_str);
-  if (!bp_id)
-return false;
-
-  return FindBreakpointID(*bp_id, position);
-}
-
 //  This function takes OLD_ARGS, which is usually the result of breaking the
 //  command string arguments into
 //  an array of space-separated strings, and searches through the arguments for





[Lldb-commits] [lldb] [lldb][NFCI] Remove unused method BreakpointIDList::FindBreakpointID(const char *, size_t *) (PR #79215)

2024-01-25 Thread Alex Langford via lldb-commits

https://github.com/bulbazord closed 
https://github.com/llvm/llvm-project/pull/79215


[Lldb-commits] [compiler-rt] [flang] [libcxx] [libunwind] [clang] [llvm] [clang-tools-extra] [lldb] [mlir] [libc] [lld] [ELF] Implement R_RISCV_TLSDESC for RISC-V (PR #79239)

2024-01-25 Thread Fangrui Song via lldb-commits


@@ -513,29 +547,125 @@ void RISCV::relocate(uint8_t *loc, const Relocation 
&rel, uint64_t val) const {
 break;
 
   case R_RISCV_RELAX:
-return; // Ignored (for now)
-
+return;
+  case R_RISCV_TLSDESC:
+// The addend is stored in the second word.
+if (config->is64)
+  write64le(loc + 8, val);
+else
+  write32le(loc + 4, val);
+break;
   default:
 llvm_unreachable("unknown relocation");
   }
 }
 
+static void tlsdescToIe(uint8_t *loc, const Relocation &rel, uint64_t val) {
+  switch (rel.type) {
+  case R_RISCV_TLSDESC_HI20:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+write32le(loc, 0x0013); // nop
+break;
+  case R_RISCV_TLSDESC_ADD_LO12:
+write32le(loc, utype(AUIPC, X_A0, hi20(val))); // auipc a0,<hi20>
+break;
+  case R_RISCV_TLSDESC_CALL:
+if (config->is64)
+  write32le(loc, itype(LD, X_A0, X_A0, lo12(val))); // ld a0,<lo12>(a0)
+else
+  write32le(loc, itype(LW, X_A0, X_A0, lo12(val))); // lw a0,<lo12>(a0)
+break;
+  default:
+llvm_unreachable("unsupported relocation for TLSDESC to IE");
+  }
+}
+
+static void tlsdescToLe(uint8_t *loc, const Relocation &rel, uint64_t val) {
+  switch (rel.type) {
+  case R_RISCV_TLSDESC_HI20:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+write32le(loc, 0x0013); // nop
+return;
+  case R_RISCV_TLSDESC_ADD_LO12:
+if (isInt<12>(val))
+  write32le(loc, 0x0013); // nop
+else
+  write32le(loc, utype(LUI, X_A0, hi20(val))); // lui a0,<hi20>
+return;
+  case R_RISCV_TLSDESC_CALL:
+if (isInt<12>(val))
+  write32le(loc, itype(ADDI, X_A0, 0, val)); // addi a0,zero,<lo12>
+else
+  write32le(loc, itype(ADDI, X_A0, X_A0, lo12(val))); // addi a0,a0,<lo12>
+return;
+  default:
+llvm_unreachable("unsupported relocation for TLSDESC to LE");
+  }
+}
+
 void RISCV::relocateAlloc(InputSectionBase &sec, uint8_t *buf) const {
   uint64_t secAddr = sec.getOutputSection()->addr;
   if (auto *s = dyn_cast<InputSection>(&sec))
 secAddr += s->outSecOff;
   else if (auto *ehIn = dyn_cast<EhInputSection>(&sec))
 secAddr += ehIn->getParent()->outSecOff;
-  for (size_t i = 0, size = sec.relocs().size(); i != size; ++i) {
-const Relocation &rel = sec.relocs()[i];
+  uint64_t tlsdescVal = 0;
+  bool isToLe = false;
+  const ArrayRef<Relocation> relocs = sec.relocs();
+  for (size_t i = 0, size = relocs.size(); i != size; ++i) {
+const Relocation &rel = relocs[i];
 uint8_t *loc = buf + rel.offset;
-const uint64_t val =
+uint64_t val =
 sec.getRelocTargetVA(sec.file, rel.type, rel.addend,
  secAddr + rel.offset, *rel.sym, rel.expr);
 
 switch (rel.expr) {
 case R_RELAX_HINT:
+  continue;
+case R_TLSDESC_PC:
+  // For R_RISCV_TLSDESC_HI20, store &got(sym)-PC to be used by the
+  // following two instructions L[DW] and ADDI.
+  if (rel.type == R_RISCV_TLSDESC_HI20)
+tlsdescVal = val;
+  else
+val = tlsdescVal;
   break;
+case R_RELAX_TLS_GD_TO_IE:
+  // Only R_RISCV_TLSDESC_HI20 reaches here. tlsdescVal will be finalized
+  // after we see R_RISCV_TLSDESC_ADD_LO12 in the R_RELAX_TLS_GD_TO_LE case.
+  // The net effect is that tlsdescVal will be smaller than `val` to take
+  // into account of NOP instructions (in the absence of R_RISCV_RELAX)
+  // before AUIPC.
+  tlsdescVal = val + rel.offset;
+  isToLe = false;
+  if (!(i + 1 != relocs.size() && relocs[i + 1].type == R_RISCV_RELAX))
+tlsdescToIe(loc, rel, val);
+  continue;
+case R_RELAX_TLS_GD_TO_LE:
+  // See the comment in handleTlsRelocation. For TLSDESC=>IE,
+  // R_RISCV_TLSDESC_{LOAD_LO12,ADD_LO12,CALL} also reach here. If isToLe is
+  // false, this is actually TLSDESC=>IE optimization.
+  if (rel.type == R_RISCV_TLSDESC_HI20) {
+tlsdescVal = val;
+isToLe = true;
+  } else {
+if (!isToLe && rel.type == R_RISCV_TLSDESC_ADD_LO12)
+  tlsdescVal -= rel.offset;
+val = tlsdescVal;
+  }
+  // When NOP conversion is eligible and R_RISCV_RELAX is present, don't
+  // write a NOP in case an unrelated instruction follows the current
+  // instruction.
+  if ((rel.type == R_RISCV_TLSDESC_HI20 ||
+   rel.type == R_RISCV_TLSDESC_LOAD_LO12 ||
+   (rel.type == R_RISCV_TLSDESC_ADD_LO12 && isToLe && !hi20(val))) &&
+  i + 1 != relocs.size() && relocs[i + 1].type == R_RISCV_RELAX)

MaskRay wrote:

> That sounds like a nice approach.

Thanks. Adopted this approach:)

> Noted. Should we bring this up with the psABI?

Filed https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/421
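
For readers following the TLSDESC to LE hunk quoted above, here is a minimal standalone sketch of the four instruction words it would produce for a given thread-pointer offset. It follows the `isInt<12>` variant of `tlsdescToLe` shown in the quoted diff. The `hi20`/`lo12`/`itype`/`utype` helpers are written out from the standard RISC-V U-type and I-type layouts; `fitsInt12` and `tlsdescToLeSequence` are hypothetical names introduced for illustration and are not lld functions.

```cpp
#include <array>
#include <cstdint>

// LUI and X_A0 match the enum hunks in the patch; ADDI is the standard
// RISC-V encoding (opcode 0x13, funct3 0).
constexpr uint32_t ADDI = 0x13, LUI = 0x37, X_A0 = 10;
constexpr uint32_t NOP = 0x00000013; // addi zero,zero,0

// Split a value into the %hi/%lo parts used by lui+addi. Adding 0x800
// compensates for addi sign-extending its 12-bit immediate.
uint32_t hi20(uint32_t val) { return (val + 0x800) >> 12; }
uint32_t lo12(uint32_t val) { return val & 0xfff; }

uint32_t itype(uint32_t op, uint32_t rd, uint32_t rs1, uint32_t imm) {
  return op | (rd << 7) | (rs1 << 15) | (imm << 20);
}
uint32_t utype(uint32_t op, uint32_t rd, uint32_t imm) {
  return op | (rd << 7) | (imm << 12);
}

// Hypothetical stand-in for llvm::isInt<12>.
bool fitsInt12(int64_t v) { return v >= -2048 && v <= 2047; }

// Hypothetical helper: the words replacing the auipc / load / addi / jalr
// TLSDESC sequence (HI20, LOAD_LO12, ADD_LO12, CALL) when relaxing to LE.
std::array<uint32_t, 4> tlsdescToLeSequence(int64_t tpOffset) {
  uint32_t val = static_cast<uint32_t>(tpOffset);
  if (fitsInt12(tpOffset))
    return {NOP, NOP, NOP, itype(ADDI, X_A0, 0, lo12(val))};
  return {NOP, NOP, utype(LUI, X_A0, hi20(val)),
          itype(ADDI, X_A0, X_A0, lo12(val))};
}
```

With a small offset such as 8 the whole sequence collapses to three NOPs and one `addi a0,zero,8`; a larger offset such as 0x1234 needs the `lui`/`addi` pair.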

https://github.com/llvm/llvm-project/pull/79239


[Lldb-commits] [compiler-rt] [flang] [libcxx] [libunwind] [clang] [llvm] [clang-tools-extra] [lldb] [mlir] [libc] [lld] [ELF] Implement R_RISCV_TLSDESC for RISC-V (PR #79239)

2024-01-25 Thread Fangrui Song via lldb-commits

https://github.com/MaskRay updated 
https://github.com/llvm/llvm-project/pull/79239

>From 3725fa4eac3d3d946289d7eb7213f3a1751a2770 Mon Sep 17 00:00:00 2001
From: Fangrui Song 
Date: Tue, 23 Jan 2024 17:58:07 -0800
Subject: [PATCH 1/2] [𝘀𝗽𝗿] initial version
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.4
---
 lld/ELF/Arch/RISCV.cpp| 158 +
 lld/ELF/Relocations.cpp   |  25 ++--
 lld/test/ELF/riscv-tlsdesc-gd-mixed.s |  26 
 lld/test/ELF/riscv-tlsdesc-relax.s| 125 +
 lld/test/ELF/riscv-tlsdesc.s  | 192 ++
 5 files changed, 492 insertions(+), 34 deletions(-)
 create mode 100644 lld/test/ELF/riscv-tlsdesc-gd-mixed.s
 create mode 100644 lld/test/ELF/riscv-tlsdesc-relax.s
 create mode 100644 lld/test/ELF/riscv-tlsdesc.s

diff --git a/lld/ELF/Arch/RISCV.cpp b/lld/ELF/Arch/RISCV.cpp
index d7d3d3e4781497..67d7e2562e9b17 100644
--- a/lld/ELF/Arch/RISCV.cpp
+++ b/lld/ELF/Arch/RISCV.cpp
@@ -61,6 +61,7 @@ enum Op {
   AUIPC = 0x17,
   JALR = 0x67,
   LD = 0x3003,
+  LUI = 0x37,
   LW = 0x2003,
   SRLI = 0x5013,
   SUB = 0x4033,
@@ -73,6 +74,7 @@ enum Reg {
   X_T0 = 5,
   X_T1 = 6,
   X_T2 = 7,
+  X_A0 = 10,
   X_T3 = 28,
 };
 
@@ -102,6 +104,26 @@ static uint32_t setLO12_S(uint32_t insn, uint32_t imm) {
  (extractBits(imm, 4, 0) << 7);
 }
 
+namespace {
+struct SymbolAnchor {
+  uint64_t offset;
+  Defined *d;
+  bool end; // true for the anchor of st_value+st_size
+};
+} // namespace
+
+struct elf::RISCVRelaxAux {
+  // This records symbol start and end offsets which will be adjusted according
+  // to the nearest relocDeltas element.
+  SmallVector anchors;
+  // For relocations[i], the actual offset is r_offset - (i ? relocDeltas[i-1] :
+  // 0).
+  std::unique_ptr relocDeltas;
+  // For relocations[i], the actual type is relocTypes[i].
+  std::unique_ptr relocTypes;
+  SmallVector writes;
+};
+
 RISCV::RISCV() {
   copyRel = R_RISCV_COPY;
   pltRel = R_RISCV_JUMP_SLOT;
@@ -119,6 +141,7 @@ RISCV::RISCV() {
 tlsGotRel = R_RISCV_TLS_TPREL32;
   }
   gotRel = symbolicRel;
+  tlsDescRel = R_RISCV_TLSDESC;
 
   // .got[0] = _DYNAMIC
   gotHeaderEntriesNum = 1;
@@ -187,6 +210,8 @@ int64_t RISCV::getImplicitAddend(const uint8_t *buf, 
RelType type) const {
   case R_RISCV_JUMP_SLOT:
 // These relocations are defined as not having an implicit addend.
 return 0;
+  case R_RISCV_TLSDESC:
+return config->is64 ? read64le(buf + 8) : read32le(buf + 4);
   }
 }
 
@@ -295,6 +320,12 @@ RelExpr RISCV::getRelExpr(const RelType type, const Symbol 
&s,
   case R_RISCV_PCREL_LO12_I:
   case R_RISCV_PCREL_LO12_S:
 return R_RISCV_PC_INDIRECT;
+  case R_RISCV_TLSDESC_HI20:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+  case R_RISCV_TLSDESC_ADD_LO12:
+return R_TLSDESC_PC;
+  case R_RISCV_TLSDESC_CALL:
+return R_TLSDESC_CALL;
   case R_RISCV_TLS_GD_HI20:
 return R_TLSGD_PC;
   case R_RISCV_TLS_GOT_HI20:
@@ -419,6 +450,7 @@ void RISCV::relocate(uint8_t *loc, const Relocation &rel, 
uint64_t val) const {
 
   case R_RISCV_GOT_HI20:
   case R_RISCV_PCREL_HI20:
+  case R_RISCV_TLSDESC_HI20:
   case R_RISCV_TLS_GD_HI20:
   case R_RISCV_TLS_GOT_HI20:
   case R_RISCV_TPREL_HI20:
@@ -430,6 +462,8 @@ void RISCV::relocate(uint8_t *loc, const Relocation &rel, 
uint64_t val) const {
   }
 
   case R_RISCV_PCREL_LO12_I:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+  case R_RISCV_TLSDESC_ADD_LO12:
   case R_RISCV_TPREL_LO12_I:
   case R_RISCV_LO12_I: {
 uint64_t hi = (val + 0x800) >> 12;
@@ -513,29 +547,113 @@ void RISCV::relocate(uint8_t *loc, const Relocation 
&rel, uint64_t val) const {
 break;
 
   case R_RISCV_RELAX:
-return; // Ignored (for now)
-
+return;
+  case R_RISCV_TLSDESC:
+// The addend is stored in the second word.
+if (config->is64)
+  write64le(loc + 8, val);
+else
+  write32le(loc + 4, val);
+break;
   default:
 llvm_unreachable("unknown relocation");
   }
 }
 
+static void tlsdescToIe(uint8_t *loc, const Relocation &rel, uint64_t val) {
+  switch (rel.type) {
+  case R_RISCV_TLSDESC_HI20:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+write32le(loc, 0x0013); // nop
+return;
+  case R_RISCV_TLSDESC_ADD_LO12:
+write32le(loc, utype(AUIPC, X_A0, hi20(val))); // auipc a0,<hi20>
+return;
+  case R_RISCV_TLSDESC_CALL:
+if (config->is64)
+  write32le(loc, itype(LD, X_A0, X_A0, lo12(val))); // ld a0,<lo12>(a0)
+else
+  write32le(loc, itype(LW, X_A0, X_A0, lo12(val))); // lw a0,<lo12>(a0)
+return;
+  default:
+llvm_unreachable("unsupported relocation for TLSDESC to IE relaxation");
+  }
+}
+
+static void tlsdescToLe(uint8_t *loc, const Relocation &rel, uint64_t val) {
+  switch (rel.type) {
+  case R_RISCV_TLSDESC_HI20:
+  case R_RISCV_TLSDESC_LOAD_LO12:
+write32le(loc, 0x0013); // nop
+return;
+  case R_RISCV_TLSDESC_A

[Lldb-commits] [clang-tools-extra] [libc] [llvm] [compiler-rt] [lld] [libcxx] [lldb] [flang] [clang] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Artem Belevich via lldb-commits

Artem-B wrote:

> This is what we already do for `--offload-arch=native` on CUDA, but this is 
> somewhat tangential. I've updated this patch to present the warning in the 
> case of multiple GPUs being detected, so I don't think there's a concern here 
> with the user being confused. If they have two GPUs, the warning will tell 
> them which one it's using with the correct `sm_` value to specify it manually 
> if they so wish. 

User confusion is only part of the issue here. With any single GPU choice we 
would still potentially produce a nonworking binary, if our GPU choice does not 
match what the user wants.

"all GPUs" has the advantage of always producing the binary that's guaranteed 
to work. Granted, in the case of multiple GPUs it comes with the compilation 
time overhead, but I think it's a better trade-off than compiling faster, but 
not working. If the overhead is unacceptable, *then* we can tweak the build, 
but in that case, the user may as well just specify the desired architectures 
explicitly.

> If there is only one GPU on the system, it should be obvious that it's going 
> to be targeted.

This case works the same with either approach.
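
As a rough illustration of the "all GPUs" strategy argued for above, the detected list could be reduced to one `--offload-arch=` flag per distinct architecture. This is a sketch only: `offloadArchFlags` is a hypothetical helper, not a clang function, and the detected names are assumed to be plain strings such as `sm_70`.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: one flag per distinct detected architecture, in sorted
// order, so two identical GPUs do not trigger two compilations.
std::vector<std::string>
offloadArchFlags(const std::vector<std::string> &detected) {
  std::set<std::string> unique(detected.begin(), detected.end());
  std::vector<std::string> flags;
  for (const std::string &arch : unique)
    flags.push_back("--offload-arch=" + arch);
  return flags;
}
```

The cost mentioned above is visible here: each extra element in the result is one more device compilation.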


https://github.com/llvm/llvm-project/pull/79373


[Lldb-commits] [clang] [lld] [libcxx] [flang] [compiler-rt] [libc] [clang-tools-extra] [llvm] [lldb] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via lldb-commits

jhuber6 wrote:

> User confusion is only part of the issue here. With any single GPU choice we 
> would still potentially produce a nonworking binary, if our GPU choice does 
> not match what the user wants.
>
> "all GPUs" has the advantage of always producing the binary that's guaranteed 
> to work. Granted, in the case of multiple GPUs it comes with the compilation 
> time overhead, but I think it's a better trade-off than compiling faster, but 
> not working. If the overhead is unacceptable, then we can tweak the build, 
> but in that case, the user may as well just specify the desired architectures 
> explicitly.

I think the semantics of `native` on other architectures are clear enough here. 
This combined with the fact that using `-march=native` will error out in the 
case of no GPUs available, or give a warning if more than one GPU is available, 
should make it sufficiently clear what it's doing. This obviously falls apart if you 
compile with `-march=native` and then move it off of the system you compiled it 
for, but the same applies for standard x64 binaries I feel.

Realistically, very, very few casual users are going to be using direct NVPTX 
targeting. The current use-case is for building tests directly for the GPU 
without needing to handle calling `amdgpu-arch` and `nvptx-arch` manually in 
CMake. If I had this in, then I could simplify a lot of CMake code in my `libc` 
project by just letting the compiler handle the autodetection. That removes one 
random program dependency from the build process. AMDGPU already has 
`-mcpu=native` so I'd like NVPTX to match if possible.

https://github.com/llvm/llvm-project/pull/79373


[Lldb-commits] [libunwind] [clang] [lld] [libcxx] [flang] [mlir] [compiler-rt] [libc] [clang-tools-extra] [llvm] [lldb] [Driver, CodeGen] Support -mtls-dialect= (PR #79256)

2024-01-25 Thread Paul Kirth via lldb-commits

https://github.com/ilovepi approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/79256


[Lldb-commits] [llvm] [lld] [libc] [lldb] [flang] [compiler-rt] [clang] [clang-tools-extra] [libcxx] Make clang report invalid target versions for all environment types. (PR #78655)

2024-01-25 Thread via lldb-commits

pirama-arumuga-nainar wrote:

@llvm/clang-vendors Adding clang vendors.  FYI, this change expands error 
reporting on invalid version numbers to all target triples (This was previously 
restricted to Android triples).  This can have potential downstream impact.  
Please review/test and let us know if this breaks any valid usages.

https://github.com/llvm/llvm-project/pull/78655


[Lldb-commits] [compiler-rt] [libunwind] [clang] [llvm] [lld] [libc] [flang] [lldb] [clang-tools-extra] [libcxx] [X86] Use RORX over SHR imm (PR #77964)

2024-01-25 Thread Bryce Wilson via lldb-commits

https://github.com/Bryce-MW updated 
https://github.com/llvm/llvm-project/pull/77964

>From d4c312b9dbf447d0a53dda0e6cdc482bd908430b Mon Sep 17 00:00:00 2001
From: Bryce Wilson 
Date: Fri, 12 Jan 2024 16:01:32 -0600
Subject: [PATCH 01/15] [X86] Use RORX over SHR imm

---
 llvm/lib/Target/X86/X86InstrShiftRotate.td |  78 ++
 llvm/test/CodeGen/X86/atomic-unordered.ll  |   3 +-
 llvm/test/CodeGen/X86/bmi2.ll  |   6 +-
 llvm/test/CodeGen/X86/cmp-shiftX-maskX.ll  |   3 +-
 llvm/test/CodeGen/X86/pr35636.ll   |   4 +-
 llvm/test/CodeGen/X86/vector-trunc-ssat.ll | 116 ++---
 6 files changed, 143 insertions(+), 67 deletions(-)

diff --git a/llvm/lib/Target/X86/X86InstrShiftRotate.td 
b/llvm/lib/Target/X86/X86InstrShiftRotate.td
index f951894db1890cd..238e8e9b6e97f30 100644
--- a/llvm/lib/Target/X86/X86InstrShiftRotate.td
+++ b/llvm/lib/Target/X86/X86InstrShiftRotate.td
@@ -879,6 +879,26 @@ let Predicates = [HasBMI2, HasEGPR, In64BitMode] in {
   defm SHLX64 : bmi_shift<"shlx{q}", GR64, i64mem, "_EVEX">, T8, PD, REX_W, 
EVEX;
 }
 
+
+def immle16_8 : ImmLeaf;
+def immle32_8 : ImmLeaf;
+def immle64_8 : ImmLeaf;
+def immle32_16 : ImmLeaf;
+def immle64_16 : ImmLeaf;
+def immle64_32 : ImmLeaf;
+
 let Predicates = [HasBMI2] in {
   // Prefer RORX which is non-destructive and doesn't update EFLAGS.
   let AddedComplexity = 10 in {
@@ -891,6 +911,64 @@ let Predicates = [HasBMI2] in {
   (RORX32ri GR32:$src, (ROT32L2R_imm8 imm:$shamt))>;
 def : Pat<(rotl GR64:$src, (i8 imm:$shamt)),
   (RORX64ri GR64:$src, (ROT64L2R_imm8 imm:$shamt))>;
+
+// A right shift by less than a smaller register size that is then
+// truncated to that register size can be replaced by RORX to
+// preserve flags with the same execution cost
+
+def : Pat<(i8 (trunc (srl GR16:$src, (i8 immle16_8:$shamt,
+  (EXTRACT_SUBREG (RORX32ri (INSERT_SUBREG (i32 (IMPLICIT_DEF)), 
GR16:$src, sub_16bit), imm:$shamt), sub_8bit)>;
+def : Pat<(i8 (trunc (sra GR16:$src, (i8 immle16_8:$shamt,
+  (EXTRACT_SUBREG (RORX32ri (INSERT_SUBREG (i32 (IMPLICIT_DEF)), 
GR16:$src, sub_16bit), imm:$shamt), sub_8bit)>;
+def : Pat<(i8 (trunc (srl GR32:$src, (i8 immle32_8:$shamt,
+  (EXTRACT_SUBREG (RORX32ri GR32:$src, imm:$shamt), sub_8bit)>;
+def : Pat<(i8 (trunc (sra GR32:$src, (i8 immle32_8:$shamt,
+  (EXTRACT_SUBREG (RORX32ri GR32:$src, imm:$shamt), sub_8bit)>;
+def : Pat<(i8 (trunc (srl GR64:$src, (i8 immle64_8:$shamt,
+  (EXTRACT_SUBREG (RORX64ri GR64:$src, imm:$shamt), sub_8bit)>;
+def : Pat<(i8 (trunc (sra GR64:$src, (i8 immle64_8:$shamt,
+  (EXTRACT_SUBREG (RORX64ri GR64:$src, imm:$shamt), sub_8bit)>;
+
+
+def : Pat<(i16 (trunc (srl GR32:$src, (i8 immle32_16:$shamt,
+  (EXTRACT_SUBREG (RORX32ri GR32:$src, imm:$shamt), sub_16bit)>;
+def : Pat<(i16 (trunc (sra GR32:$src, (i8 immle32_16:$shamt,
+  (EXTRACT_SUBREG (RORX32ri GR32:$src, imm:$shamt), sub_16bit)>;
+def : Pat<(i16 (trunc (srl GR64:$src, (i8 immle64_16:$shamt,
+  (EXTRACT_SUBREG (RORX64ri GR64:$src, imm:$shamt), sub_16bit)>;
+def : Pat<(i16 (trunc (sra GR64:$src, (i8 immle64_16:$shamt,
+  (EXTRACT_SUBREG (RORX64ri GR64:$src, imm:$shamt), sub_16bit)>;
+
+def : Pat<(i32 (trunc (srl GR64:$src, (i8 immle64_32:$shamt,
+  (EXTRACT_SUBREG (RORX64ri GR64:$src, imm:$shamt), sub_32bit)>;
+def : Pat<(i32 (trunc (sra GR64:$src, (i8 immle64_32:$shamt,
+  (EXTRACT_SUBREG (RORX64ri GR64:$src, imm:$shamt), sub_32bit)>;
+
+
+// Can't expand the load
+def : Pat<(i8 (trunc (srl (loadi32 addr:$src), (i8 immle32_8:$shamt,
+  (EXTRACT_SUBREG (RORX32mi addr:$src, imm:$shamt), sub_8bit)>;
+def : Pat<(i8 (trunc (sra (loadi32 addr:$src), (i8 immle32_8:$shamt,
+  (EXTRACT_SUBREG (RORX32mi addr:$src, imm:$shamt), sub_8bit)>;
+def : Pat<(i8 (trunc (srl (loadi64 addr:$src), (i8 immle64_8:$shamt,
+  (EXTRACT_SUBREG (RORX64mi addr:$src, imm:$shamt), sub_8bit)>;
+def : Pat<(i8 (trunc (sra (loadi64 addr:$src), (i8 immle64_8:$shamt,
+  (EXTRACT_SUBREG (RORX64mi addr:$src, imm:$shamt), sub_8bit)>;
+
+
+def : Pat<(i16 (trunc (srl (loadi32 addr:$src), (i8 immle32_16:$shamt,
+  (EXTRACT_SUBREG (RORX32mi addr:$src, imm:$shamt), sub_16bit)>;
+def : Pat<(i16 (trunc (sra (loadi32 addr:$src), (i8 immle32_16:$shamt,
+  (EXTRACT_SUBREG (RORX32mi addr:$src, imm:$shamt), sub_16bit)>;
+def : Pat<(i16 (trunc (srl (loadi64 addr:$src), (i8 immle64_16:$shamt,
+  (EXTRACT_SUBREG (RORX64mi addr:$src, imm:$shamt), sub_16bit)>;
+def : Pat<(i16 (trunc (sra (loadi64 addr:$src), (i8 immle64_16:$shamt,
+  (EXTRACT_SUBREG (RORX64mi addr:$src, imm:$shamt), sub_16bit)>;
+
+def : Pat<(i32 (trunc (

[Lldb-commits] [compiler-rt] [libunwind] [clang] [llvm] [lld] [libc] [flang] [lldb] [clang-tools-extra] [libcxx] [X86] Use RORX over SHR imm (PR #77964)

2024-01-25 Thread Bryce Wilson via lldb-commits

Bryce-MW wrote:

I think the failure on Windows is not related. Hopefully a merge fixes it...
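
The RORX patterns in this PR rely on a truncated right shift and a truncated rotate agreeing in their low bits. A small C++ check of that claim for the 32-bit to 8-bit case is sketched below. It assumes the shift amount is limited to at most 32 - 8 = 24, which is what the `immle32_8` name suggests, though the predicate body is not visible in the quoted diff. All function names here are hypothetical.

```cpp
#include <cstdint>

// Rotate right by s bits, which is what RORX computes on a 32-bit register.
uint32_t rotr32(uint32_t x, unsigned s) {
  s &= 31;
  return s == 0 ? x : (x >> s) | (x << (32 - s));
}

// Low byte of a logical right shift, i.e. (i8 (trunc (srl x, s))).
uint8_t shrTrunc8(uint32_t x, unsigned s) {
  return static_cast<uint8_t>(x >> s);
}

// Low byte of a rotate, i.e. the sub_8bit extract of the RORX result.
uint8_t rorTrunc8(uint32_t x, unsigned s) {
  return static_cast<uint8_t>(rotr32(x, s));
}

// The low byte holds bits s..s+7 of x in both cases as long as s + 7 <= 31,
// so the two agree for every shift amount from 0 through 24.
bool lowByteMatches(uint32_t x) {
  for (unsigned s = 0; s <= 24; ++s)
    if (shrTrunc8(x, s) != rorTrunc8(x, s))
      return false;
  return true;
}
```

Past that bound the rotate wraps low bits of `x` into the extracted byte, so the results can differ; for example, a shift of 25 on `0xDEADBEEF` yields different low bytes.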

https://github.com/llvm/llvm-project/pull/77964


[Lldb-commits] [lldb] [mlir] [libcxx] [lld] [flang] [libc] [clang] [llvm] [libunwind] [clang-tools-extra] [compiler-rt] [Driver, CodeGen] Support -mtls-dialect= (PR #79256)

2024-01-25 Thread Fangrui Song via lldb-commits

https://github.com/MaskRay updated 
https://github.com/llvm/llvm-project/pull/79256

>From be08e64c2c1f433b017185ce78525ad097e609be Mon Sep 17 00:00:00 2001
From: Fangrui Song 
Date: Tue, 23 Jan 2024 21:37:04 -0800
Subject: [PATCH 1/2] [𝘀𝗽𝗿] initial version
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.4
---
 clang/include/clang/Basic/CodeGenOptions.def |  3 +++
 clang/include/clang/Driver/Options.td|  5 +
 clang/lib/CodeGen/BackendUtil.cpp|  1 +
 clang/lib/Driver/ToolChains/Clang.cpp| 23 
 clang/test/CodeGen/RISCV/tls-dialect.c   | 13 +++
 clang/test/Driver/tls-dialect.c  | 19 
 6 files changed, 64 insertions(+)
 create mode 100644 clang/test/CodeGen/RISCV/tls-dialect.c
 create mode 100644 clang/test/Driver/tls-dialect.c

diff --git a/clang/include/clang/Basic/CodeGenOptions.def 
b/clang/include/clang/Basic/CodeGenOptions.def
index 2f2e45d5cf63df..7c0bfe32849614 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -369,6 +369,9 @@ ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 3, 
llvm::driver::VectorLibr
 /// The default TLS model to use.
 ENUM_CODEGENOPT(DefaultTLSModel, TLSModel, 2, GeneralDynamicTLSModel)
 
+/// Whether to enable TLSDESC. AArch64 enables TLSDESC regardless of this value.
+CODEGENOPT(EnableTLSDESC, 1, 0)
+
 /// Bit size of immediate TLS offsets (0 == use the default).
 VALUE_CODEGENOPT(TLSSize, 8, 0)
 
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 7f4fa33748faca..773bc1dcda01d5 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4419,6 +4419,8 @@ def mtls_size_EQ : Joined<["-"], "mtls-size=">, 
Group,
   HelpText<"Specify bit size of immediate TLS offsets (AArch64 ELF only): "
"12 (for 4KB) | 24 (for 16MB, default) | 32 (for 4GB) | 48 (for 
256TB, needs -mcmodel=large)">,
   MarshallingInfoInt>;
+def mtls_dialect_EQ : Joined<["-"], "mtls-dialect=">, Group,
+  Flags<[TargetSpecific]>, HelpText<"Which thread-local storage dialect to use 
for dynamic accesses of TLS variables">;
 def mimplicit_it_EQ : Joined<["-"], "mimplicit-it=">, Group;
 def mdefault_build_attributes : Joined<["-"], "mdefault-build-attributes">, 
Group;
 def mno_default_build_attributes : Joined<["-"], 
"mno-default-build-attributes">, Group;
@@ -7066,6 +7068,9 @@ def fexperimental_assignment_tracking_EQ : Joined<["-"], 
"fexperimental-assignme
   Values<"disabled,enabled,forced">, 
NormalizedValues<["Disabled","Enabled","Forced"]>,
   MarshallingInfoEnum, "Enabled">;
 
+def enable_tlsdesc : Flag<["-"], "enable-tlsdesc">,
+  MarshallingInfoFlag>;
+
 } // let Visibility = [CC1Option]
 
 
//===--===//
diff --git a/clang/lib/CodeGen/BackendUtil.cpp 
b/clang/lib/CodeGen/BackendUtil.cpp
index ec203f6f28bc17..7877e20d77f772 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -401,6 +401,7 @@ static bool initTargetOptions(DiagnosticsEngine &Diags,
   Options.UniqueBasicBlockSectionNames =
   CodeGenOpts.UniqueBasicBlockSectionNames;
   Options.TLSSize = CodeGenOpts.TLSSize;
+  Options.EnableTLSDESC = CodeGenOpts.EnableTLSDESC;
   Options.EmulatedTLS = CodeGenOpts.EmulatedTLS;
   Options.DebuggerTuning = CodeGenOpts.getDebuggerTuning();
   Options.EmitStackSizeSection = CodeGenOpts.StackSizeSection;
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 5dc614e11aab59..93fd579eb92ba5 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -5822,6 +5822,29 @@ void Clang::ConstructJob(Compilation &C, const JobAction 
&JA,
 Args.AddLastArg(CmdArgs, options::OPT_mtls_size_EQ);
   }
 
+  if (Arg *A = Args.getLastArg(options::OPT_mtls_dialect_EQ)) {
+StringRef V = A->getValue();
+bool SupportedArgument = false, EnableTLSDESC = false;
+bool Unsupported = !Triple.isOSBinFormatELF();
+if (Triple.isRISCV()) {
+  SupportedArgument = V == "desc" || V == "trad";
+  EnableTLSDESC = V == "desc";
+} else if (Triple.isX86()) {
+  SupportedArgument = V == "gnu";
+} else {
+  Unsupported = true;
+}
+if (Unsupported) {
+  D.Diag(diag::err_drv_unsupported_opt_for_target)
+  << A->getSpelling() << TripleStr;
+} else if (!SupportedArgument) {
+  D.Diag(diag::err_drv_unsupported_option_argument_for_target)
+  << A->getSpelling() << V << TripleStr;
+} else if (EnableTLSDESC) {
+  CmdArgs.push_back("-enable-tlsdesc");
+}
+  }
+
   // Add the target cpu
   std::string CPU = getCPUName(D, Args, Triple, /*FromAs*/ false);
   if (!CPU.empty()) {
diff --git a/clang/test/CodeGen/RISCV

[Lldb-commits] [clang] [lld] [llvm] [libc] [flang] [libcxx] [compiler-rt] [clang-tools-extra] [lldb] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Artem Belevich via lldb-commits

Artem-B wrote:

> I think the semantics of native on other architectures are clear enough here.

I don't think we have the same idea about that. Let's spell it out, so there's 
no confusion.

[GCC 
manual](https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html#index-march-16) 
says:
> Using -march=native enables all instruction subsets supported by the local 
> machine (hence the result might not run on different machines)

The way I read it, "all instruction subsets supported by the local machine" 
is what an all-GPUs strategy would do. The binary is expected to run on all 
GPU architecture variants available on the machine.

Granted, gcc was not written with GPUs in mind, but it's a good baseline for 
establishing existing conventions for the meaning of `-march=native`.

https://github.com/llvm/llvm-project/pull/79373
___
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits


[Lldb-commits] [clang] [lld] [llvm] [libc] [flang] [libcxx] [compiler-rt] [clang-tools-extra] [lldb] Make clang report invalid target versions for all environment types. (PR #78655)

2024-01-25 Thread Fangrui Song via lldb-commits


@@ -1443,15 +1443,17 @@ Compilation *Driver::BuildCompilation(ArrayRef ArgList) {
   const ToolChain &TC = getToolChain(
   *UArgs, computeTargetTriple(*this, TargetTriple, *UArgs));
 
-  if (TC.getTriple().isAndroid()) {
-llvm::Triple Triple = TC.getTriple();
-StringRef TripleVersionName = Triple.getEnvironmentVersionString();
-
-if (Triple.getEnvironmentVersion().empty() && TripleVersionName != "") {
-  Diags.Report(diag::err_drv_triple_version_invalid)
-  << TripleVersionName << TC.getTripleString();
-  ContainsError = true;
-}
+  // Check if the environment version is valid.
+  llvm::Triple Triple = TC.getTriple();
+  StringRef TripleVersionName = Triple.getEnvironmentVersionString();
+  StringRef TripleObjectFormat =
+  Triple.getObjectFormatTypeName(Triple.getObjectFormat());
+

MaskRay wrote:

(The prevailing code style does not insert a blank line in this case.)

https://github.com/llvm/llvm-project/pull/78655


[Lldb-commits] [clang] [lld] [llvm] [libc] [flang] [libcxx] [compiler-rt] [clang-tools-extra] [lldb] Make clang report invalid target versions for all environment types. (PR #78655)

2024-01-25 Thread Fangrui Song via lldb-commits

https://github.com/MaskRay edited 
https://github.com/llvm/llvm-project/pull/78655


[Lldb-commits] [libc] [lld] [flang] [libcxx] [clang] [clang-tools-extra] [compiler-rt] [llvm] [lldb] Make clang report invalid target versions for all environment types. (PR #78655)

2024-01-25 Thread Fangrui Song via lldb-commits


@@ -255,7 +255,7 @@ class Triple {
 Cygnus,
 CoreCLR,
 Simulator, // Simulator variants of other systems, e.g., Apple's iOS
-MacABI, // Mac Catalyst variant of Apple's iOS deployment target.
+MacABI,// Mac Catalyst variant of Apple's iOS deployment target.

MaskRay wrote:

Revert this difference? You can ignore clang-format reports.

https://github.com/llvm/llvm-project/pull/78655


[Lldb-commits] [clang] [clang-tools-extra] [lldb] [libc] [libcxx] [lld] [llvm] [flang] [compiler-rt] Make clang report invalid target versions for all environment types. (PR #78655)

2024-01-25 Thread Fangrui Song via lldb-commits


@@ -276,7 +276,7 @@ class Triple {
 Callable,
 Mesh,
 Amplification,
-
+OpenCL,

MaskRay wrote:

I wonder why we need this addition. This is not mentioned in the description.

https://github.com/llvm/llvm-project/pull/78655


[Lldb-commits] [clang] [clang-tools-extra] [lldb] [libc] [libcxx] [lld] [llvm] [flang] [compiler-rt] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via lldb-commits

jhuber6 wrote:

> > I think the semantics of native on other architectures are clear enough 
> > here.
> 
> I don't think we have the same idea about that. Let's spell it out, so 
> there's no confusion.
> 
> [GCC 
> manual](https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html#index-march-16) 
> says:
> 
> > Using -march=native enables all instruction subsets supported by the local 
> > machine (hence the result might not run on different machines)
> 
> The way I read it, "all instruction subsets supported by the local machine" 
> is what an all-GPUs strategy would do. The binary is expected to run on 
> all GPU architecture variants available on the machine.
> 
> Granted, gcc was not written with GPUs in mind, but it's a good baseline for 
> establishing existing conventions for the meaning of `-march=native`.

This more or less depends on what your definition of "local machine" is when it 
comes to a system augmented with GPUs. The verbiage of "**The** local machine" 
implies an assumption that there is only one, which I personally find 
consistent with just selecting the first GPU found on the system. There is 
ambiguity in how we should treat this in the case of multiple GPUs, but that's 
what the warning message is for: it informs the user that the "native" 
architecture is somewhat ambiguous and that the first one was selected.

Further, our current default makes sense, because it corresponds to Device ID 
zero in CUDA, which means that unless you change the environment via 
`CUDA_VISIBLE_DEVICES` or something, it will work on the default device.

So, in the case there is one device, the behavior is consistent with 
`-march=native`. In the case where there are two, we make an implicit decision 
to target the first GPU and inform the user. This method of compilation is not 
like CUDA, so we can't target all the GPUs at the same time. This will be 
useful in cases where we want to write code that simply targets a GPU that will 
"work". We have CMake code around LLVM already to do this, so it would be nice 
to get rid of that.
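
To spell out the selection rule described above, here is a minimal sketch. It is not the code from the patch: `NativeArchChoice` and `pickNativeArch` are hypothetical names, and the sketch assumes the detected architectures arrive ordered by device ID. It also assumes the warning fires whenever more than one device is present; whether identical architectures should still warn is left open.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical result type: the chosen architecture, plus whether the
// "multiple GPUs, picked the first one" warning should be shown.
struct NativeArchChoice {
  std::string Arch;
  bool WarnAmbiguous;
};

// Assumption: DetectedArchs is ordered by device ID, so index 0 corresponds
// to the default device.
std::optional<NativeArchChoice>
pickNativeArch(const std::vector<std::string> &DetectedArchs) {
  if (DetectedArchs.empty())
    return std::nullopt; // No GPU found; the caller has to diagnose this.
  // One device: unambiguous. Several devices: pick the first and warn.
  return NativeArchChoice{DetectedArchs.front(), DetectedArchs.size() > 1};
}
```

With `{"sm_89", "sm_70"}` as input, this picks `sm_89` and asks for the warning; with a single entry it picks that entry silently.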

https://github.com/llvm/llvm-project/pull/79373


[Lldb-commits] [clang] [lld] [llvm] [libc] [flang] [libcxx] [compiler-rt] [clang-tools-extra] [lldb] [VPlan] Implement cloning of VPlans. (PR #73158)

2024-01-25 Thread Florian Hahn via lldb-commits

https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/73158

>From 13a26e8e7440c3b501730b22588af393a3e543cd Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Thu, 6 Jul 2023 08:07:45 +0100
Subject: [PATCH 1/3] [VPlan] Implement cloning of VPlans.

This patch implements cloning for VPlans and recipes. Cloning is used in
the epilogue vectorization path, to clone the VPlan for the main vector
loop. This means we won't re-use a VPlan when executing the VPlan for
the epilogue vector loop, which in turn will enable us to perform
optimizations based on UF & VF.
---
 .../Transforms/Vectorize/LoopVectorize.cpp|   2 +-
 llvm/lib/Transforms/Vectorize/VPlan.cpp   | 124 
 llvm/lib/Transforms/Vectorize/VPlan.h | 182 ++
 .../Transforms/Vectorize/VPlanTest.cpp|   2 +
 4 files changed, 309 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 10c068e3b5895c..9ffd44d59ffc6d 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -10078,7 +10078,7 @@ bool LoopVectorizePass::processLoop(Loop *L) {
 EpilogueVectorizerMainLoop MainILV(L, PSE, LI, DT, TLI, TTI, AC, ORE,
EPI, &LVL, &CM, BFI, PSI, Checks);
 
-VPlan &BestMainPlan = LVP.getBestPlanFor(EPI.MainLoopVF);
+VPlan &BestMainPlan = *LVP.getBestPlanFor(EPI.MainLoopVF).clone();
 const auto &[ExpandedSCEVs, ReductionResumeValues] = LVP.executePlan(
 EPI.MainLoopVF, EPI.MainLoopUF, BestMainPlan, MainILV, DT, true);
 ++LoopsVectorized;
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.cpp b/llvm/lib/Transforms/Vectorize/VPlan.cpp
index b6e56c47c227f7..99b2a3bd59a64d 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlan.cpp
@@ -615,6 +615,18 @@ void VPBasicBlock::print(raw_ostream &O, const Twine &Indent,
 }
 #endif
 
+VPBlockBase *VPRegionBlock::clone() {
+  DenseMap Old2New;
+  DenseMap Old2NewVPValues;
+  VPBlockBase *NewEntry =
+  VPBlockUtils::cloneCFG(Entry, Old2New, Old2NewVPValues);
+  auto *NewR =
+  new VPRegionBlock(NewEntry, Old2New[Exiting], getName(), isReplicator());
+  for (VPBlockBase *Block : vp_depth_first_shallow(NewEntry))
+Block->setParent(NewR);
+  return NewR;
+}
+
 void VPRegionBlock::dropAllReferences(VPValue *NewValue) {
   for (VPBlockBase *Block : vp_depth_first_shallow(Entry))
 // Drop all references in VPBasicBlocks and replace all uses with
@@ -982,6 +994,65 @@ void VPlan::updateDominatorTree(DominatorTree *DT, BasicBlock *LoopHeaderBB,
   assert(DT->verify(DominatorTree::VerificationLevel::Fast));
 }
 
+static void remapVPValues(VPBasicBlock *OldBB, VPBasicBlock *NewBB,
+  DenseMap &Old2NewVPValues,
+  bool Full = false) {
+  for (const auto &[OldR, NewR] : zip(*OldBB, *NewBB)) {
+for (unsigned I = 0, E = NewR.getNumOperands(); I != E; ++I) {
+  VPValue *NewOp = Old2NewVPValues.lookup(OldR.getOperand(I));
+  if (!Full)
+continue;
+  NewR.setOperand(I, NewOp);
+}
+for (const auto &[OldV, NewV] :
+ zip(OldR.definedValues(), NewR.definedValues()))
+  Old2NewVPValues[OldV] = NewV;
+  }
+}
+
+VPlan *VPlan::clone() {
+  DenseMap Old2New;
+  DenseMap Old2NewVPValues;
+
+  auto *NewPlan = new VPlan();
+  SmallVector NewLiveIns;
+  for (VPValue *LI : VPLiveInsToFree) {
+VPValue *NewLI = new VPValue(LI->getLiveInIRValue());
+NewPlan->VPLiveInsToFree.push_back(NewLI);
+Old2NewVPValues[LI] = NewLI;
+  }
+
+  Old2NewVPValues[&VectorTripCount] = &NewPlan->VectorTripCount;
+  Old2NewVPValues[&VFxUF] = &NewPlan->VFxUF;
+  if (BackedgeTakenCount) {
+Old2NewVPValues[BackedgeTakenCount] = new VPValue();
+NewPlan->BackedgeTakenCount = Old2NewVPValues[BackedgeTakenCount];
+  }
+
+  auto NewPH = cast(Preheader->clone());
+  remapVPValues(cast(Preheader), cast(NewPH),
+Old2NewVPValues, /*Full*/ true);
+  VPValue *NewTC = Old2NewVPValues.lookup(TripCount);
+  if (!NewTC)
+Old2NewVPValues[TripCount] = new VPValue(TripCount->getLiveInIRValue());
+  NewPlan->TripCount = Old2NewVPValues[TripCount];
+
+  auto *NewEntry = cast(VPBlockUtils::cloneCFG(
+  getEntry(), Old2New, Old2NewVPValues, /*FullRemapping*/ true));
+
+  NewPlan->Entry = NewEntry;
+  NewPlan->Preheader = NewPH;
+  NewEntry->setPlan(NewPlan);
+  NewPH->setPlan(NewPlan);
+  NewPlan->VFs = VFs;
+  NewPlan->UFs = UFs;
+  NewPlan->Name = Name;
+
+  for (const auto &[_, LO] : LiveOuts)
+NewPlan->addLiveOut(LO->getPhi(), Old2NewVPValues[LO->getOperand(0)]);
+  return NewPlan;
+}
+
 #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
 
 Twine VPlanPrinter::getUID(const VPBlockBase *Block) {
@@ -1200,6 +1271,59 @@ void VPUser::printOperands(raw_ostream &O, VPSlotTracker &SlotTracker) const {
 }
 #endif
 

[Lldb-commits] [libcxx] [flang] [lldb] [clang] [clang-tools-extra] [lld] [llvm] [compiler-rt] [libc] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Artem Belevich via lldb-commits

Artem-B wrote:

> This method of compilation is not like CUDA, so we can't target all the GPUs 
> at the same time.

I think this is the key fact I was missing. If the patch is only for a 
standalone compilation which does not do multi-GPU compilation in principle, 
then your approach makes sense.

I was arguing from the perspective of normal offloading, which does have the 
ability to target multiple GPUs.


https://github.com/llvm/llvm-project/pull/79373


[Lldb-commits] [flang] [clang] [clang-tools-extra] [llvm] [compiler-rt] [libcxx] [libc] [lldb] [lld] Make clang report invalid target versions for all environment types. (PR #78655)

2024-01-25 Thread via lldb-commits


@@ -276,7 +276,7 @@ class Triple {
 Callable,
 Mesh,
 Amplification,
-
+OpenCL,

ZijunZhaoCCK wrote:

Some cases like 
https://github.com/llvm/llvm-project/blob/main/clang/test/CodeGenOpenCL/amdgpu-alignment.cl#L3

https://github.com/llvm/llvm-project/pull/78655


[Lldb-commits] [flang] [clang] [clang-tools-extra] [llvm] [compiler-rt] [libcxx] [libc] [lldb] [lld] [NVPTX] Add support for -march=native in standalone NVPTX (PR #79373)

2024-01-25 Thread Joseph Huber via lldb-commits

jhuber6 wrote:

> > This method of compilation is not like CUDA, so we can't target all the 
> > GPUs at the same time.
> 
> I think this is the key fact I was missing. If the patch is only for a 
> standalone compilation which does not do multi-GPU compilation in principle, 
> then your approach makes sense.
> 
> I was arguing from the perspective of normal offloading, which does have the 
> ability to target multiple GPUs.

Yes, this is more similar to OpenCL or regular CPU compilation, where a single 
job creates a simple, terminal-application-style executable. Given a single 
target, the desire is to "pick the one that will work on the default CUDA 
device without me needing to check."

https://github.com/llvm/llvm-project/pull/79373


[Lldb-commits] [clang] [lld] [llvm] [libc] [flang] [libcxx] [compiler-rt] [clang-tools-extra] [lldb] [-Wunsafe-buffer-usage] Fix AST matcher of UUCAddAssignGadget (PR #79392)

2024-01-25 Thread via lldb-commits

https://github.com/jkorous-apple updated 
https://github.com/llvm/llvm-project/pull/79392

>From dcc2b0c07681b57dbd5a82ce83f5166bb3b9ee09 Mon Sep 17 00:00:00 2001
From: Jan Korous 
Date: Wed, 24 Jan 2024 15:02:55 -0800
Subject: [PATCH] [-Wunsafe-buffer-usage] Fix AST matcher of UUCAddAssignGadget

We are not interested in nonpointers being added to.
---
 clang/lib/Analysis/UnsafeBufferUsage.cpp | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/clang/lib/Analysis/UnsafeBufferUsage.cpp b/clang/lib/Analysis/UnsafeBufferUsage.cpp
index 7df706beb22662c..9046491c9e86536 100644
--- a/clang/lib/Analysis/UnsafeBufferUsage.cpp
+++ b/clang/lib/Analysis/UnsafeBufferUsage.cpp
@@ -1081,11 +1081,16 @@ class UUCAddAssignGadget : public FixableGadget {
   }
 
   static Matcher matcher() {
+// clang-format off
 return stmt(isInUnspecifiedUntypedContext(expr(ignoringImpCasts(
 binaryOperator(hasOperatorName("+="),
-   hasLHS(declRefExpr(toSupportedVariable())),
+   hasLHS(
+declRefExpr(
+  hasPointerType(),
+  toSupportedVariable())),
hasRHS(expr().bind(OffsetTag)))
 .bind(UUCAddAssignTag);
+// clang-format on
   }
 
   virtual std::optional getFixits(const Strategy &S) const override;
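
For readers unfamiliar with the gadget, the distinction the commit message draws ("nonpointers being added to") can be illustrated with a small, purely hypothetical function; `sumFrom` is not taken from the patch or its tests. It contains one `+=` whose left-hand side is a pointer and one whose left-hand side is an `int`. The added `hasPointerType()` constraint is meant to keep only the first kind; which diagnostics or fix-its clang actually produces for this exact code is not claimed here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration only.
int sumFrom(int *Buf, std::size_t Skip, std::size_t Count) {
  // Pointer on the left-hand side, result unused: the kind of statement the
  // gadget is meant to look at.
  Buf += Skip;

  int Sum = 0;
  for (std::size_t I = 0; I < Count; ++I)
    Sum += Buf[I]; // int on the left-hand side: not of interest to the gadget
  return Sum;
}
```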



[Lldb-commits] [clang-tools-extra] [clang] [compiler-rt] [flang] [libcxx] [lldb] [lld] [llvm] [libc] [VPlan] Implement cloning of VPlans. (PR #73158)

2024-01-25 Thread Florian Hahn via lldb-commits


@@ -2694,6 +2852,9 @@ class VPlan {
   /// been modeled in VPlan directly.
   DenseMap SCEVToExpansion;
 
+  /// Construct an uninitialized VPlan, should be used for cloning only.
+  explicit VPlan() = default;
+

fhahn wrote:

Removed, thanks!

https://github.com/llvm/llvm-project/pull/73158


[Lldb-commits] [libcxx] [flang] [llvm] [clang] [lldb] [clang-tools-extra] [compiler-rt] [lld] [libc] [VPlan] Implement cloning of VPlans. (PR #73158)

2024-01-25 Thread Florian Hahn via lldb-commits


@@ -614,6 +614,61 @@ void VPBasicBlock::print(raw_ostream &O, const Twine &Indent,
   printSuccessors(O, Indent);
 }
 #endif
+static void cloneCFG(VPBlockBase *Entry,
+ DenseMap &Old2NewVPBlocks);

fhahn wrote:

Updated as suggested and renamed, thanks!

https://github.com/llvm/llvm-project/pull/73158


[Lldb-commits] [clang] [lld] [llvm] [libc] [flang] [libcxx] [compiler-rt] [clang-tools-extra] [lldb] [VPlan] Implement cloning of VPlans. (PR #73158)

2024-01-25 Thread Florian Hahn via lldb-commits


@@ -982,6 +1037,94 @@ void VPlan::updateDominatorTree(DominatorTree *DT, BasicBlock *LoopHeaderBB,
   assert(DT->verify(DominatorTree::VerificationLevel::Fast));
 }
 
+static void remapOperands(VPBlockBase *Entry, VPBlockBase *NewEntry,
+  DenseMap &Old2NewVPValues) {
+  // Update the operands of all cloned recipes starting at NewEntry. This
+  // traverses all reachable blocks. This is done in two steps, to handle cycles
+  // in PHI recipes.
+  ReversePostOrderTraversal>
+  OldDeepRPOT(Entry);
+  ReversePostOrderTraversal>
+  NewDeepRPOT(NewEntry);
+  // First, collect all mappings from old to new VPValues defined by cloned
+  // recipes.
+  for (const auto &[OldBB, NewBB] :
+   zip(VPBlockUtils::blocksOnly(OldDeepRPOT),
+   VPBlockUtils::blocksOnly(NewDeepRPOT))) {
+assert(OldBB->getRecipeList().size() == NewBB->getRecipeList().size() &&
+   "blocks must have the same number of recipes");
+
+for (const auto &[OldR, NewR] : zip(*OldBB, *NewBB)) {
+  assert(OldR.getNumOperands() == NewR.getNumOperands() &&
+ "recipes must have the same number of operands");
+  assert(OldR.getNumDefinedValues() == NewR.getNumDefinedValues() &&
+ "recipes must define the same number of operands");
+  for (const auto &[OldV, NewV] :
+   zip(OldR.definedValues(), NewR.definedValues()))
+Old2NewVPValues[OldV] = NewV;
+}
+  }
+
+  // Update all operands to use cloned VPValues.
+  for (VPBasicBlock *NewBB :
+   VPBlockUtils::blocksOnly(NewDeepRPOT)) {
+for (VPRecipeBase &NewR : *NewBB)
+  for (unsigned I = 0, E = NewR.getNumOperands(); I != E; ++I) {
+VPValue *NewOp = Old2NewVPValues.lookup(NewR.getOperand(I));
+NewR.setOperand(I, NewOp);
+  }
+  }
+}
+
+VPlan *VPlan::clone() {
+  DenseMap Old2NewVPBlocks;
+  DenseMap Old2NewVPValues;
+
+  auto *NewPlan = new VPlan();

fhahn wrote:

Reordered as suggested, thanks! 
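
The two-step remapping in the quoted hunk (first record old-to-new mappings for every defined value, then rewrite operands) can be sketched on a toy graph. This is an illustration under assumptions, not VPlan code: `Node` and `cloneNodes` are hypothetical stand-ins, and, unlike the quoted `lookup`, the sketch leaves operands without a mapping untouched.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <vector>

// Toy stand-in for a recipe: it defines one value (itself) and uses others.
struct Node {
  int Id;
  std::vector<Node *> Operands;
};

// Clone all nodes in Old; the returned vector owns the clones, in the same
// order as Old.
std::vector<std::unique_ptr<Node>> cloneNodes(const std::vector<Node *> &Old) {
  std::unordered_map<const Node *, Node *> Old2New;
  std::vector<std::unique_ptr<Node>> New;

  // Step 1: clone every node and record the mapping. The operands of the
  // clones still point at the old nodes after this loop.
  for (const Node *N : Old) {
    New.push_back(std::make_unique<Node>(*N));
    Old2New[N] = New.back().get();
  }

  // Step 2: rewrite operands. Every mapping exists by now, so an operand
  // that refers to a later node (a cycle, as with PHIs) is handled too.
  for (auto &N : New)
    for (Node *&Op : N->Operands) {
      auto It = Old2New.find(Op);
      if (It != Old2New.end())
        Op = It->second;
    }
  return New;
}
```

A single pass that remaps while cloning would fail on the first node of a cycle, because the node it refers to has not been cloned yet.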

https://github.com/llvm/llvm-project/pull/73158


[Lldb-commits] [clang-tools-extra] [libc] [libcxx] [llvm] [compiler-rt] [lldb] [lld] [flang] [clang] [VPlan] Implement cloning of VPlans. (PR #73158)

2024-01-25 Thread Florian Hahn via lldb-commits


@@ -1594,6 +1657,13 @@ class VPWidenPHIRecipe : public VPHeaderPHIRecipe {
   addOperand(Start);
   }
 
+  VPRecipeBase *clone() override {
+auto *Res = new VPWidenPHIRecipe(cast(getUnderlyingInstr()),

fhahn wrote:

Changed to `llvm_unreachable`, thanks!

https://github.com/llvm/llvm-project/pull/73158


[Lldb-commits] [libcxx] [libc] [compiler-rt] [clang-tools-extra] [lld] [llvm] [lldb] [clang] [flang] [VPlan] Implement cloning of VPlans. (PR #73158)

2024-01-25 Thread Florian Hahn via lldb-commits


@@ -982,6 +1037,94 @@ void VPlan::updateDominatorTree(DominatorTree *DT, BasicBlock *LoopHeaderBB,
   assert(DT->verify(DominatorTree::VerificationLevel::Fast));
 }
 
+static void remapOperands(VPBlockBase *Entry, VPBlockBase *NewEntry,
+  DenseMap &Old2NewVPValues) {
+  // Update the operands of all cloned recipes starting at NewEntry. This
+  // traverses all reachable blocks. This is done in two steps, to handle cycles
+  // in PHI recipes.
+  ReversePostOrderTraversal>
+  OldDeepRPOT(Entry);
+  ReversePostOrderTraversal>
+  NewDeepRPOT(NewEntry);
+  // First, collect all mappings from old to new VPValues defined by cloned
+  // recipes.
+  for (const auto &[OldBB, NewBB] :
+   zip(VPBlockUtils::blocksOnly(OldDeepRPOT),
+   VPBlockUtils::blocksOnly(NewDeepRPOT))) {
+assert(OldBB->getRecipeList().size() == NewBB->getRecipeList().size() &&
+   "blocks must have the same number of recipes");
+
+for (const auto &[OldR, NewR] : zip(*OldBB, *NewBB)) {
+  assert(OldR.getNumOperands() == NewR.getNumOperands() &&
+ "recipes must have the same number of operands");
+  assert(OldR.getNumDefinedValues() == NewR.getNumDefinedValues() &&
+ "recipes must define the same number of operands");
+  for (const auto &[OldV, NewV] :
+   zip(OldR.definedValues(), NewR.definedValues()))
+Old2NewVPValues[OldV] = NewV;
+}
+  }
+
+  // Update all operands to use cloned VPValues.
+  for (VPBasicBlock *NewBB :
+   VPBlockUtils::blocksOnly(NewDeepRPOT)) {
+for (VPRecipeBase &NewR : *NewBB)
+  for (unsigned I = 0, E = NewR.getNumOperands(); I != E; ++I) {
+VPValue *NewOp = Old2NewVPValues.lookup(NewR.getOperand(I));
+NewR.setOperand(I, NewOp);
+  }
+  }
+}
+
+VPlan *VPlan::clone() {
+  DenseMap Old2NewVPBlocks;
+  DenseMap Old2NewVPValues;
+
+  auto *NewPlan = new VPlan();
+
+  // Clone live-ins.
+  SmallVector NewLiveIns;
+  for (VPValue *OldLiveIn : VPLiveInsToFree) {
+VPValue *NewLiveIn = new VPValue(OldLiveIn->getLiveInIRValue());
+NewPlan->VPLiveInsToFree.push_back(NewLiveIn);
+Old2NewVPValues[OldLiveIn] = NewLiveIn;
+  }
+  Old2NewVPValues[&VectorTripCount] = &NewPlan->VectorTripCount;
+  Old2NewVPValues[&VFxUF] = &NewPlan->VFxUF;
+  if (BackedgeTakenCount) {
+NewPlan->BackedgeTakenCount = new VPValue();
+Old2NewVPValues[BackedgeTakenCount] = NewPlan->BackedgeTakenCount;
+  }
+  assert(TripCount && "trip count must be set");
+  if (TripCount->isLiveIn())
+Old2NewVPValues[TripCount] = new VPValue(TripCount->getLiveInIRValue());

fhahn wrote:

Added, thanks!

https://github.com/llvm/llvm-project/pull/73158


[Lldb-commits] [lldb] [flang] [clang-tools-extra] [clang] [libc] [libcxx] [lld] [compiler-rt] [llvm] [VPlan] Implement cloning of VPlans. (PR #73158)

2024-01-25 Thread Florian Hahn via lldb-commits


@@ -982,6 +1037,94 @@ void VPlan::updateDominatorTree(DominatorTree *DT, BasicBlock *LoopHeaderBB,
   assert(DT->verify(DominatorTree::VerificationLevel::Fast));
 }
 
+static void remapOperands(VPBlockBase *Entry, VPBlockBase *NewEntry,
+  DenseMap &Old2NewVPValues) {
+  // Update the operands of all cloned recipes starting at NewEntry. This
+  // traverses all reachable blocks. This is done in two steps, to handle cycles
+  // in PHI recipes.
+  ReversePostOrderTraversal>
+  OldDeepRPOT(Entry);
+  ReversePostOrderTraversal>
+  NewDeepRPOT(NewEntry);
+  // First, collect all mappings from old to new VPValues defined by cloned
+  // recipes.
+  for (const auto &[OldBB, NewBB] :
+   zip(VPBlockUtils::blocksOnly(OldDeepRPOT),
+   VPBlockUtils::blocksOnly(NewDeepRPOT))) {
+assert(OldBB->getRecipeList().size() == NewBB->getRecipeList().size() &&
+   "blocks must have the same number of recipes");
+
+for (const auto &[OldR, NewR] : zip(*OldBB, *NewBB)) {
+  assert(OldR.getNumOperands() == NewR.getNumOperands() &&
+ "recipes must have the same number of operands");
+  assert(OldR.getNumDefinedValues() == NewR.getNumDefinedValues() &&
+ "recipes must define the same number of operands");
+  for (const auto &[OldV, NewV] :
+   zip(OldR.definedValues(), NewR.definedValues()))
+Old2NewVPValues[OldV] = NewV;
+}
+  }
+
+  // Update all operands to use cloned VPValues.
+  for (VPBasicBlock *NewBB :
+   VPBlockUtils::blocksOnly(NewDeepRPOT)) {
+for (VPRecipeBase &NewR : *NewBB)
+  for (unsigned I = 0, E = NewR.getNumOperands(); I != E; ++I) {
+VPValue *NewOp = Old2NewVPValues.lookup(NewR.getOperand(I));
+NewR.setOperand(I, NewOp);
+  }
+  }
+}
+
+VPlan *VPlan::clone() {
+  DenseMap Old2NewVPBlocks;
+  DenseMap Old2NewVPValues;
+
+  auto *NewPlan = new VPlan();
+
+  // Clone live-ins.
+  SmallVector NewLiveIns;

fhahn wrote:

Not in the latest version, removed, thanks!

https://github.com/llvm/llvm-project/pull/73158


[Lldb-commits] [lld] [lldb] [libcxx] [compiler-rt] [clang-tools-extra] [llvm] [libc] [clang] [flang] [VPlan] Implement cloning of VPlans. (PR #73158)

2024-01-25 Thread Florian Hahn via lldb-commits


@@ -614,6 +614,61 @@ void VPBasicBlock::print(raw_ostream &O, const Twine &Indent,
   printSuccessors(O, Indent);
 }
 #endif
+static void cloneCFG(VPBlockBase *Entry,
+ DenseMap &Old2NewVPBlocks);
+
+static VPBlockBase *cloneVPB(VPBlockBase *BB) {
+  if (auto *VPBB = dyn_cast(BB)) {
+auto *NewBlock = new VPBasicBlock(VPBB->getName());
+for (VPRecipeBase &R : *VPBB)
+  NewBlock->appendRecipe(R.clone());
+return NewBlock;
+  }
+
+  auto *VPR = cast(BB);
+  DenseMap Old2NewVPBlocks;
+  DenseMap Old2NewVPValues;

fhahn wrote:

Not needed in the latest version, removed, thanks!

https://github.com/llvm/llvm-project/pull/73158


[Lldb-commits] [llvm] [clang-tools-extra] [libcxx] [lld] [compiler-rt] [libc] [clang] [flang] [lldb] [VPlan] Implement cloning of VPlans. (PR #73158)

2024-01-25 Thread Florian Hahn via lldb-commits


@@ -614,6 +614,61 @@ void VPBasicBlock::print(raw_ostream &O, const Twine &Indent,
   printSuccessors(O, Indent);
 }
 #endif
+static void cloneCFG(VPBlockBase *Entry,
+ DenseMap &Old2NewVPBlocks);
+
+static VPBlockBase *cloneVPB(VPBlockBase *BB) {
+  if (auto *VPBB = dyn_cast(BB)) {
+auto *NewBlock = new VPBasicBlock(VPBB->getName());
+for (VPRecipeBase &R : *VPBB)
+  NewBlock->appendRecipe(R.clone());
+return NewBlock;
+  }
+
+  auto *VPR = cast(BB);
+  DenseMap Old2NewVPBlocks;
+  DenseMap Old2NewVPValues;
+  cloneCFG(VPR->getEntry(), Old2NewVPBlocks);
+  VPBlockBase *NewEntry = Old2NewVPBlocks[VPR->getEntry()];
+  auto *NewRegion =
+  new VPRegionBlock(NewEntry, Old2NewVPBlocks[VPR->getExiting()],
+VPR->getName(), VPR->isReplicator());
+  for (VPBlockBase *Block : vp_depth_first_shallow(NewEntry))
+Block->setParent(NewRegion);
+  return NewRegion;
+}
+
+// Clone the CFG for all nodes reachable from \p Entry, this includes cloning
+// the blocks and their recipes. Operands of cloned recipes will NOT be updated.
+// Remapping of operands must be done separately.
+static void cloneCFG(VPBlockBase *Entry,
+ DenseMap &Old2NewVPBlocks) {
+  ReversePostOrderTraversal> RPOT(
+  Entry);
+  for (VPBlockBase *BB : RPOT) {
+VPBlockBase *NewBB = cloneVPB(BB);
+for (VPBlockBase *Pred : BB->getPredecessors())
+  VPBlockUtils::connectBlocks(Old2NewVPBlocks[Pred], NewBB);
+
+Old2NewVPBlocks[BB] = NewBB;
+  }
+
+#if !defined(NDEBUG)
+  // Verify that the order of predecessors and successors matches in the cloned
+  // version.
+  ReversePostOrderTraversal>
+  NewRPOT(Old2NewVPBlocks[Entry]);
+  for (const auto &[OldBB, NewBB] : zip(RPOT, NewRPOT)) {
+for (const auto &[OldPred, NewPred] :
+ zip(OldBB->getPredecessors(), NewBB->getPredecessors()))
+  assert(NewPred == Old2NewVPBlocks[OldPred] && "Different predecessors");
+
+for (const auto &[OldSucc, NewSucc] :
+ zip(OldBB->successors(), NewBB->successors()))
+  assert(NewSucc == Old2NewVPBlocks[OldSucc] && "Different successors");
+  }
+#endif

fhahn wrote:

Updated, thanks!
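
As a rough illustration of the traversal in the quoted `cloneCFG`, the sketch below clones an acyclic toy CFG in reverse post-order, so every predecessor is cloned before the block that needs it. `Block`, `connect`, `reversePostOrder` and `cloneAcyclicCFG` are hypothetical names, not LLVM APIs, and the sketch assumes all predecessors are reachable from the entry. In this toy version the predecessor order of each clone matches the original, but the successor order depends on the visiting order and may differ, which is the kind of mismatch the debug-only check in the hunk looks for.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <memory>
#include <unordered_map>
#include <unordered_set>
#include <vector>

struct Block {
  std::vector<Block *> Preds, Succs;
};

// Add an edge From -> To, updating both sides.
void connect(Block *From, Block *To) {
  From->Succs.push_back(To);
  To->Preds.push_back(From);
}

// Reverse post-order of all blocks reachable from Entry.
std::vector<Block *> reversePostOrder(Block *Entry) {
  std::vector<Block *> Post;
  std::unordered_set<Block *> Seen;
  std::function<void(Block *)> Visit = [&](Block *B) {
    if (!Seen.insert(B).second)
      return;
    for (Block *S : B->Succs)
      Visit(S);
    Post.push_back(B);
  };
  Visit(Entry);
  std::reverse(Post.begin(), Post.end());
  return Post;
}

// Clone an acyclic CFG; Storage owns the clones. Returns the clone of Entry.
Block *cloneAcyclicCFG(Block *Entry,
                       std::vector<std::unique_ptr<Block>> &Storage) {
  std::unordered_map<Block *, Block *> Old2New;
  for (Block *BB : reversePostOrder(Entry)) {
    Storage.push_back(std::make_unique<Block>());
    Block *NewBB = Storage.back().get();
    // In an acyclic CFG, reverse post-order visits predecessors first, so
    // at() finds them; it throws if that assumption is violated.
    for (Block *Pred : BB->Preds)
      connect(Old2New.at(Pred), NewBB);
    Old2New[BB] = NewBB;
  }
  return Old2New.at(Entry);
}
```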

https://github.com/llvm/llvm-project/pull/73158

