[llvm-branch-commits] [clang] PR for llvm/llvm-project#79564 (PR #79566)

2024-01-26 Thread David Sherwood via llvm-branch-commits

https://github.com/david-arm approved this pull request.

LGTM!

https://github.com/llvm/llvm-project/pull/79566
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#80140 (PR #80141)

2024-01-31 Thread David Sherwood via llvm-branch-commits

https://github.com/david-arm approved this pull request.

LGTM. This is a critical fix for SME to ensure correct behaviour and prevent 
stack corruption.

https://github.com/llvm/llvm-project/pull/80141


[llvm-branch-commits] [clang] 6584a9a - [release][docs] Update contributions to LLVM 12 for scalable vectors.

2021-02-18 Thread David Sherwood via llvm-branch-commits

Author: David Sherwood
Date: 2021-02-18T09:07:28Z
New Revision: 6584a9a4c55e10c055f9f450798b826a9624d82f

URL: 
https://github.com/llvm/llvm-project/commit/6584a9a4c55e10c055f9f450798b826a9624d82f
DIFF: 
https://github.com/llvm/llvm-project/commit/6584a9a4c55e10c055f9f450798b826a9624d82f.diff

LOG: [release][docs] Update contributions to LLVM 12 for scalable vectors.

Differential Revision: https://reviews.llvm.org/D96270

Added: 


Modified: 
clang/docs/ReleaseNotes.rst

Removed: 




diff  --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index f4ca8a855142..a43cc33988ab 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -144,6 +144,18 @@ New Pragmas in Clang
 
 - ...
 
+Modified Pragmas in Clang
+-------------------------
+
+- The "#pragma clang loop vectorize_width" has been extended to support an
+  optional 'fixed|scalable' argument, which can be used to indicate that the
+  compiler should use fixed-width or scalable vectorization.  Fixed-width is
+  assumed by default.
+
+  Scalable or vector length agnostic vectorization is an experimental feature
+  for targets that support scalable vectors. For more information please refer
+  to the Clang Language Extensions documentation.
+
 Attribute Changes in Clang
 --------------------------
 





[llvm-branch-commits] [llvm] 4cd4853 - [NFC][InstructionCost] Use InstructionCost in Transforms/Scalar/RewriteStatepointsForGC.cpp

2021-01-13 Thread David Sherwood via llvm-branch-commits

Author: David Sherwood
Date: 2021-01-13T09:42:58Z
New Revision: 4cd48535eca06245c89a9158844bb177c6f8eb63

URL: 
https://github.com/llvm/llvm-project/commit/4cd48535eca06245c89a9158844bb177c6f8eb63
DIFF: 
https://github.com/llvm/llvm-project/commit/4cd48535eca06245c89a9158844bb177c6f8eb63.diff

LOG: [NFC][InstructionCost] Use InstructionCost in Transforms/Scalar/RewriteStatepointsForGC.cpp

In places where we calculate costs using TTI.getXXXCost() interfaces
I have changed the code to use InstructionCost instead of unsigned.
The change is non functional since InstructionCost behaves in the
same way as an integer for valid costs. Currently the getXXXCost()
functions used in this file do not return invalid costs.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: 
http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Differential revision: https://reviews.llvm.org/D94484
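
The claim above, that a cost type with an invalid state behaves like a plain integer as long as every cost is valid, can be illustrated with a minimal standalone sketch. This is not the LLVM class: `SketchCost`, `getInvalid`, and `sumCosts` are hypothetical names chosen for illustration, and the rule that an invalid operand poisons the sum is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a cost type that carries a validity flag.
// This is NOT llvm::InstructionCost; it only mirrors the idea in the log.
class SketchCost {
  int64_t Value = 0;
  bool Valid = true;

public:
  SketchCost() = default;
  // Implicit on purpose, so that `SketchCost Cost = 0;` compiles just like
  // `unsigned Cost = 0;` did before the type was swapped.
  SketchCost(int64_t V) : Value(V) {}

  static SketchCost getInvalid() {
    SketchCost C;
    C.Valid = false;
    return C;
  }

  bool isValid() const { return Valid; }

  // Only meaningful for valid costs.
  int64_t getValue() const {
    assert(Valid && "querying the value of an invalid cost");
    return Value;
  }

  // Assumption of this sketch: adding an invalid cost makes the sum invalid.
  SketchCost &operator+=(const SketchCost &RHS) {
    Valid = Valid && RHS.Valid;
    Value += RHS.Value;
    return *this;
  }
};

// Accumulate per-instruction costs, the way a chain-cost helper might.
inline SketchCost sumCosts(const SketchCost *Costs, int N) {
  SketchCost Total = 0;
  for (int I = 0; I < N; ++I)
    Total += Costs[I];
  return Total;
}
```

With only valid inputs, `sumCosts` returns the same number an `unsigned` accumulator would, which is why a swap of this kind can be non-functional.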

Added: 


Modified: 
llvm/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp

Removed: 




diff  --git a/llvm/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp b/llvm/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
index 68ddebf113d1..6a95ec3a6576 100644
--- a/llvm/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
+++ b/llvm/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
@@ -2110,10 +2110,10 @@ static Value* findRematerializableChainToBasePointer(
 
 // Helper function for the "rematerializeLiveValues". Compute cost of the use
 // chain we are going to rematerialize.
-static unsigned
-chainToBasePointerCost(SmallVectorImpl<Instruction*> &Chain,
+static InstructionCost
+chainToBasePointerCost(SmallVectorImpl<Instruction *> &Chain,
                        TargetTransformInfo &TTI) {
-  unsigned Cost = 0;
+  InstructionCost Cost = 0;
 
   for (Instruction *Instr : Chain) {
     if (CastInst *CI = dyn_cast<CastInst>(Instr)) {
@@ -2220,7 +2220,7 @@ static void rematerializeLiveValues(CallBase *Call,
   assert(Info.LiveSet.count(AlternateRootPhi));
 }
 // Compute cost of this chain
-unsigned Cost = chainToBasePointerCost(ChainToBase, TTI);
+InstructionCost Cost = chainToBasePointerCost(ChainToBase, TTI);
 // TODO: We can also account for cases when we will be able to remove some
 //   of the rematerialized values by later optimization passes. I.e if
 //   we rematerialized several intersecting chains. Or if original values





[llvm-branch-commits] [llvm] c3ce262 - [NFC] Make remaining cost functions in LoopVectorize.cpp use InstructionCost

2021-01-19 Thread David Sherwood via llvm-branch-commits

Author: David Sherwood
Date: 2021-01-19T09:08:40Z
New Revision: c3ce2627949eee3b5d3012db78f670919a49b35d

URL: 
https://github.com/llvm/llvm-project/commit/c3ce2627949eee3b5d3012db78f670919a49b35d
DIFF: 
https://github.com/llvm/llvm-project/commit/c3ce2627949eee3b5d3012db78f670919a49b35d.diff

LOG: [NFC] Make remaining cost functions in LoopVectorize.cpp use InstructionCost

A previous patch has already changed getInstructionCost to return
an InstructionCost type. This patch changes the other various
getXXXCost functions to return an InstructionCost too. This is a
non-functional change - I've added a few asserts that the costs
are valid in places where we're selecting between vector call
and intrinsic costs. However, since we don't yet return invalid
costs from any of the TTI implementations these asserts should
not fire.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: 
http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Differential Revision: https://reviews.llvm.org/D94065

Added: 


Modified: 
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

Removed: 




diff  --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 5ae400fb5dc9..50e4ef01b616 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -1385,7 +1385,7 @@ class LoopVectorizationCostModel {
   /// Save vectorization decision \p W and \p Cost taken by the cost model for
   /// instruction \p I and vector width \p VF.
   void setWideningDecision(Instruction *I, ElementCount VF, InstWidening W,
-   unsigned Cost) {
+   InstructionCost Cost) {
 assert(VF.isVector() && "Expected VF >=2");
 WideningDecisions[std::make_pair(I, VF)] = std::make_pair(W, Cost);
   }
@@ -1393,7 +1393,8 @@ class LoopVectorizationCostModel {
   /// Save vectorization decision \p W and \p Cost taken by the cost model for
   /// interleaving group \p Grp and vector width \p VF.
   void setWideningDecision(const InterleaveGroup<Instruction> *Grp,
-   ElementCount VF, InstWidening W, unsigned Cost) {
+   ElementCount VF, InstWidening W,
+   InstructionCost Cost) {
 assert(VF.isVector() && "Expected VF >=2");
 /// Broadcast this decicion to all instructions inside the group.
 /// But the cost will be assigned to one instruction only.
@@ -1426,7 +1427,7 @@ class LoopVectorizationCostModel {
 
   /// Return the vectorization cost for the given instruction \p I and vector
   /// width \p VF.
-  unsigned getWideningCost(Instruction *I, ElementCount VF) {
+  InstructionCost getWideningCost(Instruction *I, ElementCount VF) {
 assert(VF.isVector() && "Expected VF >=2");
 std::pair<Instruction *, ElementCount> InstOnVF = std::make_pair(I, VF);
 assert(WideningDecisions.find(InstOnVF) != WideningDecisions.end() &&
@@ -1604,15 +1605,15 @@ class LoopVectorizationCostModel {
   /// Estimate cost of an intrinsic call instruction CI if it were vectorized
   /// with factor VF.  Return the cost of the instruction, including
   /// scalarization overhead if it's needed.
-  unsigned getVectorIntrinsicCost(CallInst *CI, ElementCount VF);
+  InstructionCost getVectorIntrinsicCost(CallInst *CI, ElementCount VF);
 
   /// Estimate cost of a call instruction CI if it were vectorized with factor
   /// VF. Return the cost of the instruction, including scalarization overhead
   /// if it's needed. The flag NeedToScalarize shows if the call needs to be
   /// scalarized -
   /// i.e. either vector version isn't available, or is too expensive.
-  unsigned getVectorCallCost(CallInst *CI, ElementCount VF,
- bool &NeedToScalarize);
+  InstructionCost getVectorCallCost(CallInst *CI, ElementCount VF,
+bool &NeedToScalarize);
 
   /// Invalidates decisions already taken by the cost model.
   void invalidateCostModelingDecisions() {
@@ -1655,30 +1656,30 @@ class LoopVectorizationCostModel {
  Type *&VectorTy);
 
   /// Calculate vectorization cost of memory instruction \p I.
-  unsigned getMemoryInstructionCost(Instruction *I, ElementCount VF);
+  InstructionCost getMemoryInstructionCost(Instruction *I, ElementCount VF);
 
   /// The cost computation for scalarized memory instruction.
-  unsigned getMemInstScalarizationCost(Instruction *I, ElementCount VF);
+  InstructionCost getMemInstScalarizationCost(Instruction *I, ElementCount VF);
 
   /// The cost computation for interleaving group of memory instructions.
-  unsigned getInterleaveGroupCost(Instruction *I, ElementCount VF);
+  InstructionCost getInterleaveGroupCost(Instruction *I, ElementCount VF);
 
   /// The cost computation for Gather/Scatter instruction.
- 

[llvm-branch-commits] [llvm] 255a507 - [NFC][InstructionCost] Use InstructionCost in lib/Transforms/IPO/IROutliner.cpp

2021-01-20 Thread David Sherwood via llvm-branch-commits

Author: David Sherwood
Date: 2021-01-20T08:33:59Z
New Revision: 255a507716bca63a375f3b8a379ccbbc58cb40da

URL: 
https://github.com/llvm/llvm-project/commit/255a507716bca63a375f3b8a379ccbbc58cb40da
DIFF: 
https://github.com/llvm/llvm-project/commit/255a507716bca63a375f3b8a379ccbbc58cb40da.diff

LOG: [NFC][InstructionCost] Use InstructionCost in lib/Transforms/IPO/IROutliner.cpp

In places where we call a TTI.getXXCost() function I have changed
the code to use InstructionCost instead of unsigned. This is in
preparation for later on when we will change the TTI interfaces
to return InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: 
http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Differential Revision: https://reviews.llvm.org/D94427

Added: 


Modified: 
llvm/include/llvm/Transforms/IPO/IROutliner.h
llvm/lib/Transforms/IPO/IROutliner.cpp

Removed: 




diff  --git a/llvm/include/llvm/Transforms/IPO/IROutliner.h b/llvm/include/llvm/Transforms/IPO/IROutliner.h
index 0346803e9ad7..eefcbe5235c1 100644
--- a/llvm/include/llvm/Transforms/IPO/IROutliner.h
+++ b/llvm/include/llvm/Transforms/IPO/IROutliner.h
@@ -44,6 +44,7 @@
 #include "llvm/Analysis/IRSimilarityIdentifier.h"
 #include "llvm/IR/PassManager.h"
 #include "llvm/IR/ValueMap.h"
+#include "llvm/Support/InstructionCost.h"
 #include "llvm/Transforms/Utils/CodeExtractor.h"
 #include 
 
@@ -150,7 +151,7 @@ struct OutlinableRegion {
   ///
   /// \param [in] TTI - The TargetTransformInfo for the parent function.
   /// \returns the code size of the region
-  unsigned getBenefit(TargetTransformInfo &TTI);
+  InstructionCost getBenefit(TargetTransformInfo &TTI);
 };
 
 /// This class is a pass that identifies similarity in a Module, extracts
@@ -214,14 +215,14 @@ class IROutliner {
   /// \param [in] CurrentGroup - The collection of OutlinableRegions to be
   /// analyzed.
   /// \returns the number of outlined instructions across all regions.
-  unsigned findBenefitFromAllRegions(OutlinableGroup &CurrentGroup);
+  InstructionCost findBenefitFromAllRegions(OutlinableGroup &CurrentGroup);
 
   /// Find the number of instructions that will be added by reloading arguments.
   ///
   /// \param [in] CurrentGroup - The collection of OutlinableRegions to be
   /// analyzed.
   /// \returns the number of added reload instructions across all regions.
-  unsigned findCostOutputReloads(OutlinableGroup &CurrentGroup);
+  InstructionCost findCostOutputReloads(OutlinableGroup &CurrentGroup);
 
   /// Find the cost and the benefit of \p CurrentGroup and save it back to
   /// \p CurrentGroup.

diff  --git a/llvm/lib/Transforms/IPO/IROutliner.cpp b/llvm/lib/Transforms/IPO/IROutliner.cpp
index 909e26b9a6e1..4b6a4f3d8fc4 100644
--- a/llvm/lib/Transforms/IPO/IROutliner.cpp
+++ b/llvm/lib/Transforms/IPO/IROutliner.cpp
@@ -86,10 +86,10 @@ struct OutlinableGroup {
 
   /// The number of instructions that will be outlined by extracting \ref
   /// Regions.
-  unsigned Benefit = 0;
+  InstructionCost Benefit = 0;
   /// The number of added instructions needed for the outlining of the \ref
   /// Regions.
-  unsigned Cost = 0;
+  InstructionCost Cost = 0;
 
   /// The argument that needs to be marked with the swifterr attribute.  If not
   /// needed, there is no value.
@@ -243,8 +243,8 @@ constantMatches(Value *V, unsigned GVN,
   return false;
 }
 
-unsigned OutlinableRegion::getBenefit(TargetTransformInfo &TTI) {
-  InstructionCost Benefit(0);
+InstructionCost OutlinableRegion::getBenefit(TargetTransformInfo &TTI) {
+  InstructionCost Benefit = 0;
 
   // Estimate the benefit of outlining a specific sections of the program.  We
   // delegate mostly this task to the TargetTransformInfo so that if the target
@@ -274,7 +274,7 @@ unsigned OutlinableRegion::getBenefit(TargetTransformInfo &TTI) {
 }
   }
 
-  return *Benefit.getValue();
+  return Benefit;
 }
 
 /// Find whether \p Region matches the global value numbering to Constant
@@ -1287,8 +1287,9 @@ void IROutliner::pruneIncompatibleRegions(
   }
 }
 
-unsigned IROutliner::findBenefitFromAllRegions(OutlinableGroup &CurrentGroup) {
-  unsigned RegionBenefit = 0;
+InstructionCost
+IROutliner::findBenefitFromAllRegions(OutlinableGroup &CurrentGroup) {
+  InstructionCost RegionBenefit = 0;
   for (OutlinableRegion *Region : CurrentGroup.Regions) {
 TargetTransformInfo &TTI = getTTI(*Region->StartBB->getParent());
 // We add the number of instructions in the region to the benefit as an
@@ -1301,8 +1302,9 @@ unsigned IROutliner::findBenefitFromAllRegions(OutlinableGroup &CurrentGroup) {
   return RegionBenefit;
 }
 
-unsigned IROutliner::findCostOutputReloads(OutlinableGroup &CurrentGroup) {
-  unsigned OverallCost = 0;
+InstructionCost
+IROutliner::findCostOutputReloads(OutlinableGroup &CurrentGroup) {
+  InstructionCost OverallCo

[llvm-branch-commits] [llvm] 2e080eb - [SVE] Add support for scalable vectorization of loops with selects and cmps

2021-01-22 Thread David Sherwood via llvm-branch-commits

Author: David Sherwood
Date: 2021-01-22T09:48:13Z
New Revision: 2e080eb00ad76654313e0e119bb7fa0ffe2f9866

URL: 
https://github.com/llvm/llvm-project/commit/2e080eb00ad76654313e0e119bb7fa0ffe2f9866
DIFF: 
https://github.com/llvm/llvm-project/commit/2e080eb00ad76654313e0e119bb7fa0ffe2f9866.diff

LOG: [SVE] Add support for scalable vectorization of loops with selects and cmps

I have removed an unnecessary assert in
LoopVectorizationCostModel::getInstructionCost that prevented a cost
from being calculated for select instructions when using scalable
vectors. In addition, I have changed AArch64TTIImpl::getCmpSelInstrCost
to only do special cost calculations for fixed width vectors and fall
back to the base version for scalable vectors.

I have added a simple cost model test for cmps and selects:

  test/Analysis/CostModel/sve-cmpsel.ll

and some simple tests that show we vectorize loops with cmp and select:

  test/Transforms/LoopVectorize/AArch64/sve-basic-vec.ll

Differential Revision: https://reviews.llvm.org/D95039

Added: 
llvm/test/Analysis/CostModel/sve-cmpsel.ll
llvm/test/Transforms/LoopVectorize/AArch64/sve-basic-vec.ll

Modified: 
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

Removed: 




diff  --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index ffa045846e59..7fda6b8fb602 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -707,7 +707,7 @@ int AArch64TTIImpl::getCmpSelInstrCost(unsigned Opcode, Type *ValTy,
   int ISD = TLI->InstructionOpcodeToISD(Opcode);
   // We don't lower some vector selects well that are wider than the register
   // width.
-  if (ValTy->isVectorTy() && ISD == ISD::SELECT) {
+  if (isa<FixedVectorType>(ValTy) && ISD == ISD::SELECT) {
 // We would need this many instructions to hide the scalarization happening.
 const int AmortizationCost = 20;
 
@@ -749,6 +749,8 @@ int AArch64TTIImpl::getCmpSelInstrCost(unsigned Opcode, Type *ValTy,
 return Entry->Cost;
 }
   }
+  // The base case handles scalable vectors fine for now, since it treats the
+  // cost as 1 * legalization cost.
   return BaseT::getCmpSelInstrCost(Opcode, ValTy, CondTy, VecPred, CostKind, I);
 }
 

diff  --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 1bc4afeae5f9..9e157f3061b6 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -7334,10 +7334,8 @@ LoopVectorizationCostModel::getInstructionCost(Instruction *I, ElementCount VF,
 const SCEV *CondSCEV = SE->getSCEV(SI->getCondition());
 bool ScalarCond = (SE->isLoopInvariant(CondSCEV, TheLoop));
 Type *CondTy = SI->getCondition()->getType();
-if (!ScalarCond) {
-  assert(!VF.isScalable() && "VF is assumed to be non scalable.");
+if (!ScalarCond)
   CondTy = VectorType::get(CondTy, VF);
-}
 return TTI.getCmpSelInstrCost(I->getOpcode(), VectorTy, CondTy,
   CmpInst::BAD_ICMP_PREDICATE, CostKind, I);
   }

diff  --git a/llvm/test/Analysis/CostModel/sve-cmpsel.ll b/llvm/test/Analysis/CostModel/sve-cmpsel.ll
new file mode 100644
index 000000000000..163c863c1ea3
--- /dev/null
+++ b/llvm/test/Analysis/CostModel/sve-cmpsel.ll
@@ -0,0 +1,146 @@
+; RUN: opt -cost-model -analyze -mtriple=aarch64--linux-gnu -mattr=+sve  < %s 
2>%t | FileCheck %s
+
+; RUN: FileCheck --check-prefix=WARN --allow-empty %s <%t
+
+; If this check fails please read test/CodeGen/AArch64/README for instructions 
on how to resolve it.
+; WARN-NOT: warning
+
+; Check icmp for legal integer vectors.
+define void @cmp_legal_int() {
+; CHECK-LABEL: 'cmp_legal_int'
+; CHECK: Cost Model: Found an estimated cost of 1 for instruction:   %1 = icmp 
ne  undef, undef
+; CHECK: Cost Model: Found an estimated cost of 1 for instruction:   %2 = icmp 
ne  undef, undef
+; CHECK: Cost Model: Found an estimated cost of 1 for instruction:   %3 = icmp 
ne  undef, undef
+; CHECK: Cost Model: Found an estimated cost of 1 for instruction:   %4 = icmp 
ne  undef, undef
+  %1 = icmp ne  undef, undef
+  %2 = icmp ne  undef, undef
+  %3 = icmp ne  undef, undef
+  %4 = icmp ne  undef, undef
+  ret void
+}
+
+; Check icmp for an illegal integer vector.
+define  @cmp_nxv4i64() {
+; CHECK-LABEL: 'cmp_nxv4i64'
+; CHECK: Cost Model: Found an estimated cost of 2 for instruction:   %res = 
icmp ne  undef, undef
+; CHECK: Cost Model: Found an estimated cost of 0 for instruction:   ret 
 %res
+  %res = icmp ne  undef, undef
+  ret  %res
+}
+
+; Check icmp for legal predicate vectors.
+define void @cmp_legal_pred() {
+; CHECK-LABEL: 'cmp_legal_pred'
+; CHECK: Cost Model: Found an estimated cost of 1 for instruction:   %1 = icmp 
ne  und

[llvm-branch-commits] [llvm] 83e7a96 - Fix build failure caused by 2e080eb00ad76654313e0e119bb7fa0ffe2f9866

2021-01-22 Thread David Sherwood via llvm-branch-commits

Author: David Sherwood
Date: 2021-01-22T09:56:53Z
New Revision: 83e7a96c06835eb37416ffdc463edc7ddd18656c

URL: 
https://github.com/llvm/llvm-project/commit/83e7a96c06835eb37416ffdc463edc7ddd18656c
DIFF: 
https://github.com/llvm/llvm-project/commit/83e7a96c06835eb37416ffdc463edc7ddd18656c.diff

LOG: Fix build failure caused by 2e080eb00ad76654313e0e119bb7fa0ffe2f9866

Added: 
llvm/test/Analysis/CostModel/AArch64/sve-cmpsel.ll

Modified: 


Removed: 
llvm/test/Analysis/CostModel/sve-cmpsel.ll



diff  --git a/llvm/test/Analysis/CostModel/sve-cmpsel.ll b/llvm/test/Analysis/CostModel/AArch64/sve-cmpsel.ll
similarity index 100%
rename from llvm/test/Analysis/CostModel/sve-cmpsel.ll
rename to llvm/test/Analysis/CostModel/AArch64/sve-cmpsel.ll





[llvm-branch-commits] [llvm] a650920 - [SVE] Fix inline assembly parsing crash

2021-01-04 Thread David Sherwood via llvm-branch-commits

Author: David Sherwood
Date: 2021-01-04T09:11:05Z
New Revision: a65092040ad4fefcdad18382781090839cad3b67

URL: 
https://github.com/llvm/llvm-project/commit/a65092040ad4fefcdad18382781090839cad3b67
DIFF: 
https://github.com/llvm/llvm-project/commit/a65092040ad4fefcdad18382781090839cad3b67.diff

LOG: [SVE] Fix inline assembly parsing crash

This patch fixes a crash encountered when compiling this code:

  ...
  float16_t a;
  __asm__("fminv %h[a], %[b], %[c].h"
  : [a] "=r" (a)
  : [b] "Upl" (b), [c] "w" (c))

The issue here is when using the 'h' modifier for a register
constraint 'r'.

Differential Revision: https://reviews.llvm.org/D93537

Added: 


Modified: 
llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
llvm/test/CodeGen/AArch64/inline-asm-constraints-bad-sve.ll

Removed: 




diff  --git a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
index c18e9a4e6db1..c7fa49c965a8 100644
--- a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -647,7 +647,8 @@ bool AArch64AsmPrinter::printAsmRegInClass(const MachineOperand &MO,
   const TargetRegisterInfo *RI = STI->getRegisterInfo();
   Register Reg = MO.getReg();
   unsigned RegToPrint = RC->getRegister(RI->getEncodingValue(Reg));
-  assert(RI->regsOverlap(RegToPrint, Reg));
+  if (!RI->regsOverlap(RegToPrint, Reg))
+return true;
   O << AArch64InstPrinter::getRegisterName(RegToPrint, AltName);
   return false;
 }

diff  --git a/llvm/test/CodeGen/AArch64/inline-asm-constraints-bad-sve.ll b/llvm/test/CodeGen/AArch64/inline-asm-constraints-bad-sve.ll
index 5a2f4746af87..aa25d118c9b5 100644
--- a/llvm/test/CodeGen/AArch64/inline-asm-constraints-bad-sve.ll
+++ b/llvm/test/CodeGen/AArch64/inline-asm-constraints-bad-sve.ll
@@ -6,6 +6,7 @@ target triple = "aarch64-unknown-linux-gnu"
 ; CHECK: error: couldn't allocate input reg for constraint 'Upa'
 ; CHECK: error: couldn't allocate input reg for constraint 'r'
 ; CHECK: error: couldn't allocate output register for constraint 'w'
+; CHECK: error: unknown token in expression
 
 define  @foo1(i32 *%in) {
 entry:
@@ -27,3 +28,11 @@ entry:
   %1 = call  asm sideeffect "mov $0.b, $1.b \0A", 
"=&w,w"( %0)
   ret  %1
 }
+
+define half @foo4( *%inp,  *%inv) {
+entry:
+  %0 = load , * %inp, align 2
+  %1 = load , * %inv, align 16
+  %2 = call half asm "fminv ${0:h}, $1, $2.h", "=r,@3Upl,w"( 
%0,  %1)
+  ret half %2
+}





[llvm-branch-commits] [llvm] d1bf26f - [AArch64][SVE] Add lowering for llvm abs intrinsic

2021-01-08 Thread David Sherwood via llvm-branch-commits

Author: David Sherwood
Date: 2021-01-08T08:55:25Z
New Revision: d1bf26fd943e39a4e3bb55bdaeec5559e74dee99

URL: 
https://github.com/llvm/llvm-project/commit/d1bf26fd943e39a4e3bb55bdaeec5559e74dee99
DIFF: 
https://github.com/llvm/llvm-project/commit/d1bf26fd943e39a4e3bb55bdaeec5559e74dee99.diff

LOG: [AArch64][SVE] Add lowering for llvm abs intrinsic

Add functionality to permit lowering of the abs and neg intrinsics
using the passthru variants.

Differential Revision: https://reviews.llvm.org/D94160
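
As a rough scalar model of what a merging "passthru" operation means, the sketch below computes abs lane by lane: active lanes (predicate set) receive the absolute value, inactive lanes receive the corresponding passthru element. The function name `absMergePassthru` and the use of `std::vector` are illustrative assumptions only; the operand order (predicate, operand, passthru) follows the `DAG.getNode` calls in the diff below.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model of a merging, predicated abs (hypothetical helper, not LLVM
// code). Lane I of the result is |Op[I]| when Pg[I] is set and Passthru[I]
// otherwise.
std::vector<int32_t> absMergePassthru(const std::vector<bool> &Pg,
                                      const std::vector<int32_t> &Op,
                                      const std::vector<int32_t> &Passthru) {
  assert(Pg.size() == Op.size() && Op.size() == Passthru.size());
  std::vector<int32_t> Result(Op.size());
  for (std::size_t I = 0; I < Op.size(); ++I) {
    if (!Pg[I]) {
      Result[I] = Passthru[I]; // inactive lane: keep the passthru value
      continue;
    }
    // Negate in unsigned arithmetic so the most negative input wraps
    // instead of triggering signed-overflow undefined behaviour.
    uint32_t U = static_cast<uint32_t>(Op[I]);
    Result[I] = Op[I] < 0 ? static_cast<int32_t>(0u - U) : Op[I];
  }
  return Result;
}
```

An all-true predicate makes the passthru operand irrelevant, which is how an unpredicated abs can be expressed through the same node.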

Added: 


Modified: 
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
llvm/lib/Target/AArch64/AArch64ISelLowering.h
llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
llvm/lib/Target/AArch64/SVEInstrFormats.td
llvm/test/CodeGen/AArch64/sve-fixed-length-int-arith.ll
llvm/test/CodeGen/AArch64/sve-int-arith.ll

Removed: 




diff  --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index fdf3acfe68c5..926d952425d0 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -187,6 +187,8 @@ static bool isMergePassthruOpcode(unsigned Opc) {
   case AArch64ISD::CTLZ_MERGE_PASSTHRU:
   case AArch64ISD::CTPOP_MERGE_PASSTHRU:
   case AArch64ISD::DUP_MERGE_PASSTHRU:
+  case AArch64ISD::ABS_MERGE_PASSTHRU:
+  case AArch64ISD::NEG_MERGE_PASSTHRU:
   case AArch64ISD::FNEG_MERGE_PASSTHRU:
   case AArch64ISD::SIGN_EXTEND_INREG_MERGE_PASSTHRU:
   case AArch64ISD::ZERO_EXTEND_INREG_MERGE_PASSTHRU:
@@ -1097,6 +1099,7 @@ AArch64TargetLowering::AArch64TargetLowering(const TargetMachine &TM,
   setOperationAction(ISD::SHL, VT, Custom);
   setOperationAction(ISD::SRL, VT, Custom);
   setOperationAction(ISD::SRA, VT, Custom);
+  setOperationAction(ISD::ABS, VT, Custom);
   setOperationAction(ISD::VECREDUCE_ADD, VT, Custom);
   setOperationAction(ISD::VECREDUCE_AND, VT, Custom);
   setOperationAction(ISD::VECREDUCE_OR, VT, Custom);
@@ -1345,6 +1348,7 @@ void AArch64TargetLowering::addTypeForFixedLengthSVE(MVT VT) {
   setOperationAction(ISD::EXTRACT_SUBVECTOR, VT, Custom);
 
   // Lower fixed length vector operations to scalable equivalents.
+  setOperationAction(ISD::ABS, VT, Custom);
   setOperationAction(ISD::ADD, VT, Custom);
   setOperationAction(ISD::AND, VT, Custom);
   setOperationAction(ISD::ANY_EXTEND, VT, Custom);
@@ -1743,6 +1747,8 @@ const char *AArch64TargetLowering::getTargetNodeName(unsigned Opcode) const {
 MAKE_CASE(AArch64ISD::FSQRT_MERGE_PASSTHRU)
 MAKE_CASE(AArch64ISD::FRECPX_MERGE_PASSTHRU)
 MAKE_CASE(AArch64ISD::FABS_MERGE_PASSTHRU)
+MAKE_CASE(AArch64ISD::ABS_MERGE_PASSTHRU)
+MAKE_CASE(AArch64ISD::NEG_MERGE_PASSTHRU)
 MAKE_CASE(AArch64ISD::SETCC_MERGE_ZERO)
 MAKE_CASE(AArch64ISD::ADC)
 MAKE_CASE(AArch64ISD::SBC)
@@ -3661,6 +3667,12 @@ SDValue AArch64TargetLowering::LowerINTRINSIC_WO_CHAIN(SDValue Op,
   case Intrinsic::aarch64_sve_fabs:
 return DAG.getNode(AArch64ISD::FABS_MERGE_PASSTHRU, dl, Op.getValueType(),
Op.getOperand(2), Op.getOperand(3), Op.getOperand(1));
+  case Intrinsic::aarch64_sve_abs:
+return DAG.getNode(AArch64ISD::ABS_MERGE_PASSTHRU, dl, Op.getValueType(),
+   Op.getOperand(2), Op.getOperand(3), Op.getOperand(1));
+  case Intrinsic::aarch64_sve_neg:
+return DAG.getNode(AArch64ISD::NEG_MERGE_PASSTHRU, dl, Op.getValueType(),
+   Op.getOperand(2), Op.getOperand(3), Op.getOperand(1));
   case Intrinsic::aarch64_sve_convert_to_svbool: {
 EVT OutVT = Op.getValueType();
 EVT InVT = Op.getOperand(1).getValueType();
@@ -4163,9 +4175,12 @@ SDValue AArch64TargetLowering::LowerSTORE(SDValue Op,
 }
 
 // Generate SUBS and CSEL for integer abs.
-static SDValue LowerABS(SDValue Op, SelectionDAG &DAG) {
+SDValue AArch64TargetLowering::LowerABS(SDValue Op, SelectionDAG &DAG) const {
   MVT VT = Op.getSimpleValueType();
 
+  if (VT.isVector())
+return LowerToPredicatedOp(Op, DAG, AArch64ISD::ABS_MERGE_PASSTHRU);
+
   SDLoc DL(Op);
   SDValue Neg = DAG.getNode(ISD::SUB, DL, VT, DAG.getConstant(0, DL, VT),
 Op.getOperand(0));

diff  --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.h b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
index 96aaf40250e5..23d5ce91b3e3 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -114,6 +114,8 @@ enum NodeType : unsigned {
   FCVTZS_MERGE_PASSTHRU,
   SIGN_EXTEND_INREG_MERGE_PASSTHRU,
   ZERO_EXTEND_INREG_MERGE_PASSTHRU,
+  ABS_MERGE_PASSTHRU,
+  NEG_MERGE_PASSTHRU,
 
   SETCC_MERGE_ZERO,
 
@@ -812,6 +814,7 @@ class AArch64TargetLowering : public TargetLowering {
   SDValue ThisVal) const;
 
   SDValue LowerSTORE(SDValue Op, SelectionDAG &DAG) const;
+  SDValue LowerABS(SDValue Op, SelectionDAG &DAG

[llvm-branch-commits] [clang] 38d18d9 - [SVE] Add support to vectorize_width loop pragma for scalable vectors

2021-01-08 Thread David Sherwood via llvm-branch-commits

Author: David Sherwood
Date: 2021-01-08T11:37:27Z
New Revision: 38d18d93534d290d045bbbfa86337e70f1139dc2

URL: 
https://github.com/llvm/llvm-project/commit/38d18d93534d290d045bbbfa86337e70f1139dc2
DIFF: 
https://github.com/llvm/llvm-project/commit/38d18d93534d290d045bbbfa86337e70f1139dc2.diff

LOG: [SVE] Add support to vectorize_width loop pragma for scalable vectors

This patch adds support for two new variants of the vectorize_width
pragma:

1. vectorize_width(X[, fixed|scalable]) where an optional second
parameter is passed to the vectorize_width pragma, which indicates if
the user wishes to use fixed width or scalable vectorization. For
example the user can now write something like:

  #pragma clang loop vectorize_width(4, fixed)
or
  #pragma clang loop vectorize_width(4, scalable)

In the absence of a second parameter it is assumed the user wants
fixed width vectorization, in order to maintain compatibility with
existing code.
2. vectorize_width(fixed|scalable) where the width is left unspecified,
but the user hints what type of vectorization they prefer, either
fixed width or scalable.

I have implemented this by making use of the LLVM loop hint attribute:

  llvm.loop.vectorize.scalable.enable

Tests were added to

  clang/test/CodeGenCXX/pragma-loop.cpp

for both the 'fixed' and 'scalable' optional parameter.

See this thread for context: 
http://lists.llvm.org/pipermail/cfe-dev/2020-November/067262.html

Differential Revision: https://reviews.llvm.org/D89031
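
The pragma forms described above could be used as in the following sketch. The function names are hypothetical, and the example assumes a Clang release that includes this patch (LLVM 12 or later); other compilers will typically ignore or warn about the unknown pragma. As the documentation change below notes, the vectorizer may still fall back to fixed-width vectorization when the target has no scalable vectors, so the pragmas are hints and do not change the results computed.

```cpp
#include <cassert>
#include <cstddef>

// Hint: vectorize with a width of 4 using fixed-width vectors.
void scaleFixed(float *Dst, const float *Src, std::size_t N, float K) {
#pragma clang loop vectorize_width(4, fixed)
  for (std::size_t I = 0; I < N; ++I)
    Dst[I] = Src[I] * K;
}

// Hint: vectorize with a width of 4 using scalable vectors.
void scaleScalable(float *Dst, const float *Src, std::size_t N, float K) {
#pragma clang loop vectorize_width(4, scalable)
  for (std::size_t I = 0; I < N; ++I)
    Dst[I] = Src[I] * K;
}

// Hint: prefer scalable vectorization, leave the width to the compiler.
void scaleAnyScalable(float *Dst, const float *Src, std::size_t N, float K) {
#pragma clang loop vectorize_width(scalable)
  for (std::size_t I = 0; I < N; ++I)
    Dst[I] = Src[I] * K;
}
```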

Added: 


Modified: 
clang/docs/LanguageExtensions.rst
clang/include/clang/Basic/Attr.td
clang/include/clang/Basic/DiagnosticParseKinds.td
clang/lib/AST/AttrImpl.cpp
clang/lib/CodeGen/CGLoopInfo.cpp
clang/lib/CodeGen/CGLoopInfo.h
clang/lib/Parse/ParsePragma.cpp
clang/lib/Sema/SemaStmtAttr.cpp
clang/test/AST/ast-print-pragmas.cpp
clang/test/CodeGenCXX/pragma-loop-pr27643.cpp
clang/test/CodeGenCXX/pragma-loop.cpp
clang/test/Parser/pragma-loop.cpp

Removed: 




diff  --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index 0c01a2bbc52b..6fa6c55b15fc 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -3107,8 +3107,18 @@ manually enable vectorization or interleaving.
 ...
   }
 
-The vector width is specified by ``vectorize_width(_value_)`` and the interleave
-count is specified by ``interleave_count(_value_)``, where
+The vector width is specified by
+``vectorize_width(_value_[, fixed|scalable])``, where _value_ is a positive
+integer and the type of vectorization can be specified with an optional
+second parameter. The default for the second parameter is 'fixed' and
+refers to fixed width vectorization, whereas 'scalable' indicates the
+compiler should use scalable vectors instead. Another use of vectorize_width
+is ``vectorize_width(fixed|scalable)`` where the user can hint at the type
+of vectorization to use without specifying the exact width. In both variants
+of the pragma the vectorizer may decide to fall back on fixed width
+vectorization if the target does not support scalable vectors.
+
+The interleave count is specified by ``interleave_count(_value_)``, where
 _value_ is a positive integer. This is useful for specifying the optimal
 width/count of the set of target architectures supported by your application.
 

diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td
index b84e6a14f371..248409946123 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -3375,8 +3375,10 @@ def LoopHint : Attr {
"PipelineDisabled", "PipelineInitiationInterval", "Distribute",
"VectorizePredicate"]>,
   EnumArgument<"State", "LoopHintState",
-   ["enable", "disable", "numeric", "assume_safety", "full"],
-   ["Enable", "Disable", "Numeric", "AssumeSafety", "Full"]>,
+   ["enable", "disable", "numeric", "fixed_width",
+"scalable_width", "assume_safety", "full"],
+   ["Enable", "Disable", "Numeric", "FixedWidth",
+"ScalableWidth", "AssumeSafety", "Full"]>,
   ExprArgument<"Value">];
 
   let AdditionalMembers = [{

diff --git a/clang/include/clang/Basic/DiagnosticParseKinds.td b/clang/include/clang/Basic/DiagnosticParseKinds.td
index 8f78bbfc4e70..0ed80a481e78 100644
--- a/clang/include/clang/Basic/DiagnosticParseKinds.td
+++ b/clang/include/clang/Basic/DiagnosticParseKinds.td
@@ -1396,6 +1396,12 @@ def err_pragma_loop_invalid_option : Error<
   "%select{invalid|missing}0 option%select{ %1|}0; expected vectorize, "
   "vectorize_width, interleave, interleave_count, unroll, unroll_count, "
   "pipeline, pipeline_initiation_interval, vectorize_predicate, or 

[llvm-branch-commits] [llvm] b7ccaca - [NFC] Remove min/max functions from InstructionCost

2021-01-11 Thread David Sherwood via llvm-branch-commits

Author: David Sherwood
Date: 2021-01-11T09:00:12Z
New Revision: b7ccaca53700fce21b0e8e5d7bd2a956bd391fee

URL: 
https://github.com/llvm/llvm-project/commit/b7ccaca53700fce21b0e8e5d7bd2a956bd391fee
DIFF: 
https://github.com/llvm/llvm-project/commit/b7ccaca53700fce21b0e8e5d7bd2a956bd391fee.diff

LOG: [NFC] Remove min/max functions from InstructionCost

Removed the InstructionCost::min/max functions because it's
fine to use std::min/max instead.

Differential Revision: https://reviews.llvm.org/D94301
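
The reasoning in this log can be illustrated with a small standalone sketch: `std::min` and `std::max` work on any type that provides `operator<`, so a class with comparison operators needs no min/max helpers of its own. `SimpleCost` below is a hypothetical stand-in, not the LLVM class, and it deliberately ignores the valid/invalid state that the real type carries.

```cpp
#include <algorithm>

// Hypothetical stand-in for a cost type; not llvm::InstructionCost.
struct SimpleCost {
  int Value;
};

// std::min and std::max only require operator< on the element type.
inline bool operator<(const SimpleCost &L, const SimpleCost &R) {
  return L.Value < R.Value;
}

// Thin wrappers showing the standard algorithms replacing custom helpers.
inline SimpleCost cheaper(const SimpleCost &A, const SimpleCost &B) {
  return std::min(A, B);
}

inline SimpleCost costlier(const SimpleCost &A, const SimpleCost &B) {
  return std::max(A, B);
}
```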

Added: 


Modified: 
llvm/include/llvm/Support/InstructionCost.h
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
llvm/unittests/Support/InstructionCostTest.cpp

Removed: 




diff --git a/llvm/include/llvm/Support/InstructionCost.h b/llvm/include/llvm/Support/InstructionCost.h
index fe56d49b4174..725f8495ac09 100644
--- a/llvm/include/llvm/Support/InstructionCost.h
+++ b/llvm/include/llvm/Support/InstructionCost.h
@@ -196,14 +196,6 @@ class InstructionCost {
 return *this >= RHS2;
   }
 
-  static InstructionCost min(InstructionCost LHS, InstructionCost RHS) {
-return LHS < RHS ? LHS : RHS;
-  }
-
-  static InstructionCost max(InstructionCost LHS, InstructionCost RHS) {
-return LHS > RHS ? LHS : RHS;
-  }
-
   void print(raw_ostream &OS) const;
 };
 

diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 5b91495bd844..bd673d112b3a 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -6305,7 +6305,7 @@ bool SLPVectorizerPass::tryToVectorizeList(ArrayRef<Value *> VL, BoUpSLP &R,
 Cost -= UserCost;
   }
 
-  MinCost = InstructionCost::min(MinCost, Cost);
+  MinCost = std::min(MinCost, Cost);
 
   if (Cost.isValid() && Cost < -SLPCostThreshold) {
 LLVM_DEBUG(dbgs() << "SLP: Vectorizing list at cost:" << Cost << ".\n");

diff --git a/llvm/unittests/Support/InstructionCostTest.cpp b/llvm/unittests/Support/InstructionCostTest.cpp
index da3d3f47a212..8ba9f990f027 100644
--- a/llvm/unittests/Support/InstructionCostTest.cpp
+++ b/llvm/unittests/Support/InstructionCostTest.cpp
@@ -59,6 +59,6 @@ TEST_F(CostTest, Operators) {
   EXPECT_EQ(*(VThree.getValue()), 3);
   EXPECT_EQ(IThreeA.getValue(), None);
 
-  EXPECT_EQ(InstructionCost::min(VThree, VNegTwo), -2);
-  EXPECT_EQ(InstructionCost::max(VThree, VSix), 6);
+  EXPECT_EQ(std::min(VThree, VNegTwo), -2);
+  EXPECT_EQ(std::max(VThree, VSix), 6);
 }





[llvm-branch-commits] [llvm] 40abeb1 - [NFC][InstructionCost] Change LoopVectorizationCostModel::getInstructionCost to return InstructionCost

2021-01-11 Thread David Sherwood via llvm-branch-commits

Author: David Sherwood
Date: 2021-01-11T09:22:37Z
New Revision: 40abeb11f4584e8a07163d6c7e24011ac45f104c

URL: 
https://github.com/llvm/llvm-project/commit/40abeb11f4584e8a07163d6c7e24011ac45f104c
DIFF: 
https://github.com/llvm/llvm-project/commit/40abeb11f4584e8a07163d6c7e24011ac45f104c.diff

LOG: [NFC][InstructionCost] Change LoopVectorizationCostModel::getInstructionCost to return InstructionCost

This patch is part of a series of patches that migrate integer
instruction costs to use InstructionCost. In the function
selectVectorizationFactor I have simply asserted that the cost
is valid and extracted the value as is. In future we expect
to encounter invalid costs, but we should filter out those
vectorization factors that lead to such invalid costs.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: 
http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Differential Revision: https://reviews.llvm.org/D92178
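
The log says that invalid costs should eventually be filtered out rather than asserted on. A hedged, standalone sketch of that idea follows. `Candidate` and `selectWidth` are hypothetical names, the validity flag stands in for `InstructionCost::isValid()`, and the per-lane division mirrors the `VectorCost = cost / width` step visible in the diff below. It is not the code that landed.

```cpp
#include <vector>

// Hypothetical candidate: a vectorization factor plus a cost that may be
// invalid (Valid == false stands in for an invalid InstructionCost).
struct Candidate {
  unsigned VF;
  bool Valid;
  int Cost;
};

// Pick the width with the lowest per-lane cost, skipping candidates whose
// cost is invalid instead of asserting on them. Returns 1 (scalar) when no
// vector candidate beats the scalar loop cost.
inline unsigned selectWidth(int ScalarCost,
                            const std::vector<Candidate> &Candidates) {
  unsigned Width = 1;
  float Best = static_cast<float>(ScalarCost);
  for (const Candidate &C : Candidates) {
    if (!C.Valid || C.VF == 0)
      continue; // Filter out factors that lead to invalid costs.
    // Divide the vector loop cost by the width to compare per-lane costs.
    float VectorCost = static_cast<float>(C.Cost) / static_cast<float>(C.VF);
    if (VectorCost < Best) {
      Best = VectorCost;
      Width = C.VF;
    }
  }
  return Width;
}
```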

Added: 


Modified: 
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

Removed: 




diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 180cbb8ef847..e6cadf8f8796 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -130,6 +130,7 @@
 #include "llvm/Support/Compiler.h"
 #include "llvm/Support/Debug.h"
 #include "llvm/Support/ErrorHandling.h"
+#include "llvm/Support/InstructionCost.h"
 #include "llvm/Support/MathExtras.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
@@ -1635,7 +1636,7 @@ class LoopVectorizationCostModel {
   /// is
   /// false, then all operations will be scalarized (i.e. no vectorization has
   /// actually taken place).
-  using VectorizationCostTy = std::pair<unsigned, bool>;
+  using VectorizationCostTy = std::pair<InstructionCost, bool>;
 
   /// Returns the expected execution cost. The unit of the cost does
   /// not matter because we use the 'cost' units to compare different
@@ -1649,7 +1650,8 @@ class LoopVectorizationCostModel {
 
   /// The cost-computation logic from getInstructionCost which provides
   /// the vector type as an output parameter.
-  unsigned getInstructionCost(Instruction *I, ElementCount VF, Type *&VectorTy);
+  InstructionCost getInstructionCost(Instruction *I, ElementCount VF,
+ Type *&VectorTy);
 
   /// Calculate vectorization cost of memory instruction \p I.
   unsigned getMemoryInstructionCost(Instruction *I, ElementCount VF);
@@ -1693,7 +1695,7 @@ class LoopVectorizationCostModel {
   /// A type representing the costs for instructions if they were to be
   /// scalarized rather than vectorized. The entries are Instruction-Cost
   /// pairs.
-  using ScalarCostsTy = DenseMap<Instruction *, unsigned>;
+  using ScalarCostsTy = DenseMap<Instruction *, InstructionCost>;
 
   /// A set containing all BasicBlocks that are known to present after
   /// vectorization as a predicated block.
@@ -5759,10 +5761,13 @@ LoopVectorizationCostModel::selectVectorizationFactor(ElementCount MaxVF) {
   // vectors when the loop has a hint to enable vectorization for a given VF.
   assert(!MaxVF.isScalable() && "scalable vectors not yet supported");
 
-  float Cost = expectedCost(ElementCount::getFixed(1)).first;
-  const float ScalarCost = Cost;
+  InstructionCost ExpectedCost = expectedCost(ElementCount::getFixed(1)).first;
+  LLVM_DEBUG(dbgs() << "LV: Scalar loop costs: " << ExpectedCost << ".\n");
+  assert(ExpectedCost.isValid() && "Unexpected invalid cost for scalar loop");
+
   unsigned Width = 1;
-  LLVM_DEBUG(dbgs() << "LV: Scalar loop costs: " << (int)ScalarCost << ".\n");
+  const float ScalarCost = *ExpectedCost.getValue();
+  float Cost = ScalarCost;
 
   bool ForceVectorization = Hints->getForce() == LoopVectorizeHints::FK_Enabled;
   if (ForceVectorization && MaxVF.isVector()) {
@@ -5777,7 +5782,8 @@ LoopVectorizationCostModel::selectVectorizationFactor(ElementCount MaxVF) {
 // we need to divide the cost of the vector loops by the width of
 // the vector elements.
 VectorizationCostTy C = expectedCost(ElementCount::getFixed(i));
-float VectorCost = C.first / (float)i;
+assert(C.first.isValid() && "Unexpected invalid cost for vector loop");
+float VectorCost = *C.first.getValue() / (float)i;
 LLVM_DEBUG(dbgs() << "LV: Vector loop of width " << i
   << " costs: " << (int)VectorCost << ".\n");
 if (!C.second && !ForceVectorization) {
@@ -6119,8 +6125,10 @@ unsigned LoopVectorizationCostModel::selectInterleaveCount(ElementCount VF,
 
   // If we did not calculate the cost for VF (because the user selected the VF)
   // then we calculate the cost of VF here.
-  if (LoopCost == 0)
-LoopCost = expectedCost(VF).first;
+  if (LoopCost == 0) {
+assert(expectedCost(VF).first.isValid() && "Expected a valid cost");
+L

[llvm-branch-commits] [llvm] 3bf7d47 - [NFC][InstructionCost] Remove isValid() asserts in SLPVectorizer.cpp

2020-12-21 Thread David Sherwood via llvm-branch-commits

Author: David Sherwood
Date: 2020-12-21T09:12:28Z
New Revision: 3bf7d47a977d463940f558259d24d43d76d50e6f

URL: 
https://github.com/llvm/llvm-project/commit/3bf7d47a977d463940f558259d24d43d76d50e6f
DIFF: 
https://github.com/llvm/llvm-project/commit/3bf7d47a977d463940f558259d24d43d76d50e6f.diff

LOG: [NFC][InstructionCost] Remove isValid() asserts in SLPVectorizer.cpp

An earlier patch introduced asserts that the InstructionCost is
valid because at that time the ReuseShuffleCost variable was an
unsigned. However, now that the variable is an InstructionCost
instance the asserts can be removed.

See this thread for context:
http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

See this patch for the introduction of the type:
https://reviews.llvm.org/D91174

Added: 


Modified: 
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

Removed: 




diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 80d510185470..b03fb203c6d7 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -3806,23 +3806,17 @@ InstructionCost BoUpSLP::getEntryCost(TreeEntry *E) {
   if (NeedToShuffleReuses) {
 for (unsigned Idx : E->ReuseShuffleIndices) {
   Instruction *I = cast<Instruction>(VL[Idx]);
-  InstructionCost Cost = TTI->getInstructionCost(I, CostKind);
-  assert(Cost.isValid() && "Invalid instruction cost");
-  ReuseShuffleCost -= *(Cost.getValue());
+  ReuseShuffleCost -= TTI->getInstructionCost(I, CostKind);
 }
 for (Value *V : VL) {
   Instruction *I = cast<Instruction>(V);
-  InstructionCost Cost = TTI->getInstructionCost(I, CostKind);
-  assert(Cost.isValid() && "Invalid instruction cost");
-  ReuseShuffleCost += *(Cost.getValue());
+  ReuseShuffleCost += TTI->getInstructionCost(I, CostKind);
 }
   }
   for (Value *V : VL) {
 Instruction *I = cast<Instruction>(V);
 assert(E->isOpcodeOrAlt(I) && "Unexpected main/alternate opcode");
-InstructionCost Cost = TTI->getInstructionCost(I, CostKind);
-assert(Cost.isValid() && "Invalid instruction cost");
-ScalarCost += *(Cost.getValue());
+ScalarCost += TTI->getInstructionCost(I, CostKind);
   }
   // VecCost is equal to sum of the cost of creating 2 vectors
   // and the cost of creating shuffle.





[llvm-branch-commits] [llvm] 71bd59f - [SVE] Add support for scalable vectors with vectorize.scalable.enable loop attribute

2020-12-02 Thread David Sherwood via llvm-branch-commits

Author: David Sherwood
Date: 2020-12-02T13:23:43Z
New Revision: 71bd59f0cb6db0211418b07127e8b311d944e2c2

URL: 
https://github.com/llvm/llvm-project/commit/71bd59f0cb6db0211418b07127e8b311d944e2c2
DIFF: 
https://github.com/llvm/llvm-project/commit/71bd59f0cb6db0211418b07127e8b311d944e2c2.diff

LOG: [SVE] Add support for scalable vectors with vectorize.scalable.enable loop attribute

In this patch I have added support for a new loop hint called
vectorize.scalable.enable that says whether we should enable scalable
vectorization or not. If a user wants to instruct the compiler to
vectorize a loop with scalable vectors they can now do this as
follows:

  br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !2
  ...
  !2 = !{!2, !3, !4}
  !3 = !{!"llvm.loop.vectorize.width", i32 8}
  !4 = !{!"llvm.loop.vectorize.scalable.enable", i1 true}

Setting the hint to false simply reverts the behaviour back to the
default, using fixed width vectors.

Differential Revision: https://reviews.llvm.org/D88962
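
To make the interaction between the two hints concrete, here is a standalone sketch of how a width hint and a scalable hint could be combined into an element count. All names (`LoopHints`, `SketchElementCount`, `getElementCountHint`) are hypothetical. The behaviour assumed here follows the log and the header comment in the diff below: no width hint means no result, and a missing or false scalable hint means fixed width.

```cpp
// Hypothetical mirror of an element count: a minimum lane count plus a flag
// saying whether it is multiplied by the runtime vscale.
struct SketchElementCount {
  unsigned MinVal;
  bool Scalable;
};

// Hypothetical parsed loop metadata.
struct LoopHints {
  bool HasWidth;    // "llvm.loop.vectorize.width" present?
  unsigned Width;   // its i32 operand
  bool HasScalable; // "llvm.loop.vectorize.scalable.enable" present?
  bool Scalable;    // its i1 operand
};

// Returns false when there is no width hint. Otherwise fills Out with the
// width and treats an absent or false scalable hint as fixed width.
inline bool getElementCountHint(const LoopHints &H, SketchElementCount &Out) {
  if (!H.HasWidth)
    return false;
  Out.MinVal = H.Width;
  Out.Scalable = H.HasScalable && H.Scalable;
  return true;
}
```

With the metadata from the log (`width` 8, `scalable.enable` true) this yields a scalable count of 8, i.e. a `vscale x 8` vector.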

Added: 
llvm/test/Transforms/LoopVectorize/no_array_bounds_scalable.ll

Modified: 
llvm/docs/LangRef.rst
llvm/include/llvm/Transforms/Utils/LoopUtils.h
llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
llvm/lib/Transforms/Scalar/WarnMissedTransforms.cpp
llvm/lib/Transforms/Utils/LoopUtils.cpp
llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
llvm/test/Transforms/LoopVectorize/metadata-width.ll

Removed: 




diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 1964f2416b8f..03a8387d9d1f 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -5956,6 +5956,21 @@ vectorization:
!0 = !{!"llvm.loop.vectorize.predicate.enable", i1 0}
!1 = !{!"llvm.loop.vectorize.predicate.enable", i1 1}
 
+'``llvm.loop.vectorize.scalable.enable``' Metadata
+^^^
+
+This metadata selectively enables or disables scalable vectorization for the
+loop, and only has any effect if vectorization for the loop is already enabled.
+The first operand is the string ``llvm.loop.vectorize.scalable.enable``
+and the second operand is a bit. If the bit operand value is 1 scalable
+vectorization is enabled, whereas a value of 0 reverts to the default fixed
+width vectorization:
+
+.. code-block:: llvm
+
+   !0 = !{!"llvm.loop.vectorize.scalable.enable", i1 0}
+   !1 = !{!"llvm.loop.vectorize.scalable.enable", i1 1}
+
 '``llvm.loop.vectorize.width``' Metadata
 
 

diff --git a/llvm/include/llvm/Transforms/Utils/LoopUtils.h b/llvm/include/llvm/Transforms/Utils/LoopUtils.h
index 360e262e8ae0..ef348ed56129 100644
--- a/llvm/include/llvm/Transforms/Utils/LoopUtils.h
+++ b/llvm/include/llvm/Transforms/Utils/LoopUtils.h
@@ -213,6 +213,13 @@ Optional<const MDOperand *> findStringMetadataForLoop(const Loop *TheLoop,
 /// Find named metadata for a loop with an integer value.
 llvm::Optional<int> getOptionalIntLoopAttribute(Loop *TheLoop, StringRef Name);
 
+/// Find a combination of metadata ("llvm.loop.vectorize.width" and
+/// "llvm.loop.vectorize.scalable.enable") for a loop and use it to construct a
+/// ElementCount. If the metadata "llvm.loop.vectorize.width" cannot be found
+/// then None is returned.
+Optional<ElementCount>
+getOptionalElementCountLoopAttribute(Loop *TheLoop);
+
 /// Create a new loop identifier for a loop created from a loop transformation.
 ///
 /// @param OrigLoopID The loop ID of the loop before the transformation.

diff --git a/llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h b/llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
index 2be9ef10ac4f..f701e08961a0 100644
--- a/llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
+++ b/llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h
@@ -29,6 +29,7 @@
 #include "llvm/ADT/MapVector.h"
 #include "llvm/Analysis/LoopAccessAnalysis.h"
 #include "llvm/Analysis/OptimizationRemarkEmitter.h"
+#include "llvm/Support/TypeSize.h"
 #include "llvm/Transforms/Utils/LoopUtils.h"
 
 namespace llvm {
@@ -43,8 +44,14 @@ namespace llvm {
 /// for example 'force', means a decision has been made. So, we need to be
 /// careful NOT to add them if the user hasn't specifically asked so.
 class LoopVectorizeHints {
-  enum HintKind { HK_WIDTH, HK_UNROLL, HK_FORCE, HK_ISVECTORIZED,
-  HK_PREDICATE };
+  enum HintKind {
+HK_WIDTH,
+HK_UNROLL,
+HK_FORCE,
+HK_ISVECTORIZED,
+HK_PREDICATE,
+HK_SCALABLE
+  };
 
   /// Hint - associates name and validation with the hint value.
   struct Hint {
@@ -73,6 +80,9 @@ class LoopVectorizeHints {
   /// Vector Predicate
   Hint Predicate;
 
+  /// Says whether we should use fixed width or scalable vectorization.
+  Hint Scalable;
+
   /// Return the loop metadata prefix.
   static StringRef Prefi

[llvm-branch-commits] [llvm] 59f17b5 - [SVE] Fix crashes with inline assembly

2020-12-08 Thread David Sherwood via llvm-branch-commits

Author: David Sherwood
Date: 2020-12-08T13:48:43Z
New Revision: 59f17b57d9c9abf86d8dcc05c49d3bbd807e29c7

URL: 
https://github.com/llvm/llvm-project/commit/59f17b57d9c9abf86d8dcc05c49d3bbd807e29c7
DIFF: 
https://github.com/llvm/llvm-project/commit/59f17b57d9c9abf86d8dcc05c49d3bbd807e29c7.diff

LOG: [SVE] Fix crashes with inline assembly

All the crashes found compiling inline assembly are fixed in this
patch by changing AArch64TargetLowering::getRegForInlineAsmConstraint
to be more resilient to mismatched value and register types. For
example, it makes no sense to request a predicate register for
a nxv2i64 type and so on.

Tests have been added here:

  test/CodeGen/AArch64/inline-asm-constraints-bad-sve.ll

Differential Revision: https://reviews.llvm.org/D92554
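
The decision logic described in this log can be sketched outside LLVM as a plain function. This is an illustration built from the diff below, under stated assumptions: `RegClass`, `SketchVT` and `pickRegClass` are hypothetical names, the subtarget floating-point check is omitted, and `RegClass::None` covers both the explicit null register class and the cases where the real code falls through to later handling.

```cpp
#include <string>

// Hypothetical register class tags standing in for the AArch64 classes.
enum class RegClass { None, GPR32, GPR64, FPR16, FPR32, FPR64, FPR128, ZPR, PPR, PPR_3b };

// Hypothetical value-type descriptor.
struct SketchVT {
  bool Scalable;      // scalable vector type?
  bool ElementIsI1;   // predicate-like element type (i1)?
  unsigned FixedBits; // size in bits; only meaningful when !Scalable
};

inline RegClass pickRegClass(const std::string &Constraint, const SketchVT &VT) {
  if (Constraint == "r") {
    // General-purpose registers cannot hold scalable vectors.
    if (VT.Scalable)
      return RegClass::None;
    return VT.FixedBits == 64 ? RegClass::GPR64 : RegClass::GPR32;
  }
  if (Constraint == "w") {
    // Scalable data vectors go in Z registers; predicates do not.
    if (VT.Scalable)
      return VT.ElementIsI1 ? RegClass::None : RegClass::ZPR;
    switch (VT.FixedBits) {
    case 16:  return RegClass::FPR16;
    case 32:  return RegClass::FPR32;
    case 64:  return RegClass::FPR64;
    case 128: return RegClass::FPR128;
    default:  return RegClass::None;
    }
  }
  if (Constraint == "Upa" || Constraint == "Upl") {
    // Predicate registers only make sense for scalable i1 vectors.
    if (!VT.Scalable || !VT.ElementIsI1)
      return RegClass::None;
    return Constraint == "Upl" ? RegClass::PPR_3b : RegClass::PPR;
  }
  return RegClass::None;
}
```

The log's example of requesting a predicate register for nxv2i64 corresponds to `pickRegClass("Upa", {true, false, 0})`, which yields `RegClass::None` rather than tripping an assert.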

Added: 
llvm/test/CodeGen/AArch64/inline-asm-constraints-bad-sve.ll

Modified: 
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

Removed: 




diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index cca31a701d56..700c281cdaa9 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -7511,23 +7511,30 @@ AArch64TargetLowering::getRegForInlineAsmConstraint(
   if (Constraint.size() == 1) {
 switch (Constraint[0]) {
 case 'r':
-  if (VT.getSizeInBits() == 64)
+  if (VT.isScalableVector())
+return std::make_pair(0U, nullptr);
+  if (VT.getFixedSizeInBits() == 64)
 return std::make_pair(0U, &AArch64::GPR64commonRegClass);
   return std::make_pair(0U, &AArch64::GPR32commonRegClass);
-case 'w':
+case 'w': {
   if (!Subtarget->hasFPARMv8())
 break;
-  if (VT.isScalableVector())
-return std::make_pair(0U, &AArch64::ZPRRegClass);
-  if (VT.getSizeInBits() == 16)
+  if (VT.isScalableVector()) {
+if (VT.getVectorElementType() != MVT::i1)
+  return std::make_pair(0U, &AArch64::ZPRRegClass);
+return std::make_pair(0U, nullptr);
+  }
+  uint64_t VTSize = VT.getFixedSizeInBits();
+  if (VTSize == 16)
 return std::make_pair(0U, &AArch64::FPR16RegClass);
-  if (VT.getSizeInBits() == 32)
+  if (VTSize == 32)
 return std::make_pair(0U, &AArch64::FPR32RegClass);
-  if (VT.getSizeInBits() == 64)
+  if (VTSize == 64)
 return std::make_pair(0U, &AArch64::FPR64RegClass);
-  if (VT.getSizeInBits() == 128)
+  if (VTSize == 128)
 return std::make_pair(0U, &AArch64::FPR128RegClass);
   break;
+}
 // The instructions that this constraint is designed for can
 // only take 128-bit registers so just use that regclass.
 case 'x':
@@ -7548,10 +7555,11 @@ AArch64TargetLowering::getRegForInlineAsmConstraint(
   } else {
 PredicateConstraint PC = parsePredicateConstraint(Constraint);
 if (PC != PredicateConstraint::Invalid) {
-  assert(VT.isScalableVector());
+  if (!VT.isScalableVector() || VT.getVectorElementType() != MVT::i1)
+return std::make_pair(0U, nullptr);
   bool restricted = (PC == PredicateConstraint::Upl);
   return restricted ? std::make_pair(0U, &AArch64::PPR_3bRegClass)
-  : std::make_pair(0U, &AArch64::PPRRegClass);
+: std::make_pair(0U, &AArch64::PPRRegClass);
 }
   }
   if (StringRef("{cc}").equals_lower(Constraint))

diff --git a/llvm/test/CodeGen/AArch64/inline-asm-constraints-bad-sve.ll b/llvm/test/CodeGen/AArch64/inline-asm-constraints-bad-sve.ll
new file mode 100644
index 000000000000..5a2f4746af87
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/inline-asm-constraints-bad-sve.ll
@@ -0,0 +1,29 @@
+; RUN: not llc -mtriple=aarch64-none-linux-gnu -mattr=+sve -o - %s 2>&1 | FileCheck %s
+
+target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
+target triple = "aarch64-unknown-linux-gnu"
+
+; CHECK: error: couldn't allocate input reg for constraint 'Upa'
+; CHECK: error: couldn't allocate input reg for constraint 'r'
+; CHECK: error: couldn't allocate output register for constraint 'w'
+
+define  @foo1(i32 *%in) {
+entry:
+  %0 = load i32, i32* %in, align 4
+  %1 = call  asm sideeffect "mov $0.b, $1.b \0A", 
"=@3Upa,@3Upa"(i32 %0)
+  ret  %1
+}
+
+define  @foo2( *%in) {
+entry:
+  %0 = load , * %in, align 16
+  %1 = call  asm sideeffect "ptrue p0.s, #1 \0Afabs $0.s, 
p0/m, $1.s \0A", "=w,r"( %0)
+  ret  %1
+}
+
+define  @foo3( *%in) {
+entry:
+  %0 = load , * %in, align 2
+  %1 = call  asm sideeffect "mov $0.b, $1.b \0A", 
"=&w,w"( %0)
+  ret  %1
+}





[llvm-branch-commits] [llvm] e22259f - [SVE] Remove duplicate assert in DAGTypeLegalizer::SplitVecOp_EXTRACT_SUBVECTOR

2020-12-08 Thread David Sherwood via llvm-branch-commits

Author: David Sherwood
Date: 2020-12-08T14:41:14Z
New Revision: e22259fafe5e2f5e0219366ff92bba15ec70ff56

URL: 
https://github.com/llvm/llvm-project/commit/e22259fafe5e2f5e0219366ff92bba15ec70ff56
DIFF: 
https://github.com/llvm/llvm-project/commit/e22259fafe5e2f5e0219366ff92bba15ec70ff56.diff

LOG: [SVE] Remove duplicate assert in DAGTypeLegalizer::SplitVecOp_EXTRACT_SUBVECTOR

Added: 


Modified: 
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp

Removed: 




diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
index 9a0925061105..1525543a60b6 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
@@ -2281,10 +2281,6 @@ SDValue DAGTypeLegalizer::SplitVecOp_EXTRACT_SUBVECTOR(SDNode *N) {
   // We know that the extracted result type is legal.
   EVT SubVT = N->getValueType(0);
 
-  if (SubVT.isScalableVector() !=
-  N->getOperand(0).getValueType().isScalableVector())
-report_fatal_error("Extracting fixed from scalable not implemented");
-
   SDValue Idx = N->getOperand(1);
   SDLoc dl(N);
   SDValue Lo, Hi;





[llvm-branch-commits] [llvm] 9b76160 - [Support] Introduce a new InstructionCost class

2020-12-11 Thread David Sherwood via llvm-branch-commits

Author: David Sherwood
Date: 2020-12-11T08:12:54Z
New Revision: 9b76160e53f67008ff21095098129a2949595a06

URL: 
https://github.com/llvm/llvm-project/commit/9b76160e53f67008ff21095098129a2949595a06
DIFF: 
https://github.com/llvm/llvm-project/commit/9b76160e53f67008ff21095098129a2949595a06.diff

LOG: [Support] Introduce a new InstructionCost class

This is the first in a series of patches that attempts to migrate
existing cost instructions to return a new InstructionCost class
in place of a simple integer. This new class is intended to be
as light-weight and simple as possible, with a full range of
arithmetic and comparison operators that largely mirror the same
sets of operations on basic types, such as integers. The main
advantage to using an InstructionCost is that it can encode a
particular cost state in addition to a value. The initial
implementation only has two states - Normal and Invalid - but these
could be expanded over time if necessary. An invalid state can
be used to represent an unknown cost or an instruction that is
prohibitively expensive.

This patch adds the new class and changes the getInstructionCost
interface to return the new class. Other cost functions, such as
getUserCost, etc., will be migrated in future patches as I believe
this to be less disruptive. One benefit of this new class is that
it provides a way to unify many of the magic costs in the codebase
where the cost is set to a deliberately high number to prevent
optimisations taking place, e.g. vectorization. It also provides
a route to represent the extremely high, and unknown, cost of
scalarization of scalable vectors, which is not currently supported.

Differential Revision: https://reviews.llvm.org/D91174
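
As a reading aid, the idea of "a value plus a cost state" can be sketched in a few lines of standalone C++. `SketchCost` is a hypothetical simplification written for this note, not the class added by the patch: the member names, the choice that an invalid operand makes a sum invalid, and the ordering of invalid costs after all valid ones are assumptions of the sketch.

```cpp
// Hypothetical simplification of a cost that carries a state; not the
// llvm::InstructionCost implementation.
class SketchCost {
public:
  enum State { Valid, Invalid };

  SketchCost(int V = 0) : Value(V), St(Valid) {}

  static SketchCost makeInvalid() {
    SketchCost C;
    C.St = Invalid;
    return C;
  }

  bool isValid() const { return St == Valid; }
  void setInvalid() { St = Invalid; }

  // Return the value, or Fallback when the cost is invalid.
  int valueOr(int Fallback) const { return isValid() ? Value : Fallback; }

  // Assumed here: adding an invalid cost makes the sum invalid, so callers
  // can accumulate freely and test validity once at the end.
  SketchCost &operator+=(const SketchCost &RHS) {
    if (RHS.St == Invalid)
      St = Invalid;
    Value += RHS.Value;
    return *this;
  }

  // Assumed here: any invalid cost compares greater than any valid cost,
  // replacing "magic" very large numbers.
  friend bool operator<(const SketchCost &L, const SketchCost &R) {
    if (L.St != R.St)
      return L.St < R.St;
    return L.Value < R.Value;
  }

private:
  int Value;
  State St;
};
```

Under these assumptions a caller can sum costs in a loop and check `isValid()` once on the total, rather than checking each intermediate result.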

Added: 
llvm/include/llvm/Support/InstructionCost.h
llvm/lib/Support/InstructionCost.cpp
llvm/unittests/Support/InstructionCostTest.cpp

Modified: 
llvm/include/llvm/Analysis/TargetTransformInfo.h
llvm/include/llvm/IR/DiagnosticInfo.h
llvm/lib/Analysis/CostModel.cpp
llvm/lib/CodeGen/InterleavedLoadCombinePass.cpp
llvm/lib/IR/DiagnosticInfo.cpp
llvm/lib/Support/CMakeLists.txt
llvm/lib/Transforms/IPO/HotColdSplitting.cpp
llvm/lib/Transforms/Scalar/CallSiteSplitting.cpp
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
llvm/unittests/Support/CMakeLists.txt

Removed: 




diff --git a/llvm/include/llvm/Analysis/TargetTransformInfo.h b/llvm/include/llvm/Analysis/TargetTransformInfo.h
index af57176401b4..abaf07fad3d4 100644
--- a/llvm/include/llvm/Analysis/TargetTransformInfo.h
+++ b/llvm/include/llvm/Analysis/TargetTransformInfo.h
@@ -27,6 +27,7 @@
 #include "llvm/Pass.h"
 #include "llvm/Support/AtomicOrdering.h"
 #include "llvm/Support/DataTypes.h"
+#include "llvm/Support/InstructionCost.h"
 #include 
 
 namespace llvm {
@@ -231,19 +232,26 @@ class TargetTransformInfo {
   ///
   /// Note, this method does not cache the cost calculation and it
   /// can be expensive in some cases.
-  int getInstructionCost(const Instruction *I, enum TargetCostKind kind) const {
+  InstructionCost getInstructionCost(const Instruction *I,
+ enum TargetCostKind kind) const {
+InstructionCost Cost;
 switch (kind) {
 case TCK_RecipThroughput:
-  return getInstructionThroughput(I);
-
+  Cost = getInstructionThroughput(I);
+  break;
 case TCK_Latency:
-  return getInstructionLatency(I);
-
+  Cost = getInstructionLatency(I);
+  break;
 case TCK_CodeSize:
 case TCK_SizeAndLatency:
-  return getUserCost(I, kind);
+  Cost = getUserCost(I, kind);
+  break;
+default:
+  llvm_unreachable("Unknown instruction cost kind");
 }
-llvm_unreachable("Unknown instruction cost kind");
+if (Cost == -1)
+  Cost.setInvalid();
+return Cost;
   }
 
   /// Underlying constants for 'cost' values in this interface.

diff --git a/llvm/include/llvm/IR/DiagnosticInfo.h b/llvm/include/llvm/IR/DiagnosticInfo.h
index 644d853b9b0d..c457072d50f1 100644
--- a/llvm/include/llvm/IR/DiagnosticInfo.h
+++ b/llvm/include/llvm/IR/DiagnosticInfo.h
@@ -35,6 +35,7 @@ namespace llvm {
 class DiagnosticPrinter;
 class Function;
 class Instruction;
+class InstructionCost;
 class LLVMContext;
 class Module;
 class SMDiagnostic;
@@ -437,6 +438,7 @@ class DiagnosticInfoOptimizationBase : public DiagnosticInfoWithLocationBase {
 Argument(StringRef Key, ElementCount EC);
 Argument(StringRef Key, bool B) : Key(Key), Val(B ? "true" : "false") {}
 Argument(StringRef Key, DebugLoc dl);
+Argument(StringRef Key, InstructionCost C);
   };
 
   /// \p PassName is the name of the pass emitting this diagnostic. \p

diff --git a/llvm/include/llvm/Support/InstructionCost.h b/llvm/include/llvm/Support/InstructionCost.h
new file mode 100644
index 000000000000..fe56d49b4174
--- /dev/nul

[llvm-branch-commits] [llvm] 616f978 - Fix build issue caused by 9b76160e53f67008ff21095098129a2949595a06

2020-12-11 Thread David Sherwood via llvm-branch-commits

Author: David Sherwood
Date: 2020-12-11T09:43:55Z
New Revision: 616f9781af076942c177abcb7041761924757ea6

URL: 
https://github.com/llvm/llvm-project/commit/616f9781af076942c177abcb7041761924757ea6
DIFF: 
https://github.com/llvm/llvm-project/commit/616f9781af076942c177abcb7041761924757ea6.diff

LOG: Fix build issue caused by 9b76160e53f67008ff21095098129a2949595a06

Added: 


Modified: 
llvm/include/llvm/Analysis/TargetTransformInfo.h

Removed: 




diff --git a/llvm/include/llvm/Analysis/TargetTransformInfo.h b/llvm/include/llvm/Analysis/TargetTransformInfo.h
index abaf07fad3d4..3ba77c9a8dc9 100644
--- a/llvm/include/llvm/Analysis/TargetTransformInfo.h
+++ b/llvm/include/llvm/Analysis/TargetTransformInfo.h
@@ -246,8 +246,6 @@ class TargetTransformInfo {
 case TCK_SizeAndLatency:
   Cost = getUserCost(I, kind);
   break;
-default:
-  llvm_unreachable("Unknown instruction cost kind");
 }
 if (Cost == -1)
   Cost.setInvalid();





[llvm-branch-commits] [llvm] release/19.x: [MachineLICM] Don't allow hoisting invariant loads across mem barrier. (#116987) (PR #117154)

2024-11-25 Thread David Sherwood via llvm-branch-commits

david-arm wrote:

> @david-arm Should this be merged?

Hi, yes, I think it should be merged. It's a fairly serious bug fix.

https://github.com/llvm/llvm-project/pull/117154


[llvm-branch-commits] [llvm] release/20.x: [LoopVectorize] Fix cost model assert when vectorising calls (#125716) (PR #126209)

2025-02-07 Thread David Sherwood via llvm-branch-commits

david-arm wrote:

This also needs a build error fix: 3872e55758a5de035c032a975f244302c3ddacc3.
I'm not sure of the best way to do this. Should I backport two commits or
create a new PR with a joint patch?

https://github.com/llvm/llvm-project/pull/126209


[llvm-branch-commits] [llvm] [LV] Reduce register usage for scaled reductions (PR #133090)

2025-04-10 Thread David Sherwood via llvm-branch-commits


@@ -2,6 +2,7 @@
 ; RUN: opt -passes=loop-vectorize -enable-epilogue-vectorization=false -mattr=+neon,+dotprod -force-vector-interleave=1 -S < %s | FileCheck %s --check-prefixes=CHECK-INTERLEAVE1
 ; RUN: opt -passes=loop-vectorize -enable-epilogue-vectorization=false -mattr=+neon,+dotprod -S < %s | FileCheck %s --check-prefixes=CHECK-INTERLEAVED
 ; RUN: opt -passes=loop-vectorize -enable-epilogue-vectorization=false -mattr=+neon,+dotprod -force-vector-interleave=1 -vectorizer-maximize-bandwidth -S < %s | FileCheck %s --check-prefixes=CHECK-MAXBW
+; RUN: opt -passes=loop-vectorize -debug-only=loop-vectorize --disable-output -S < %s 2>&1 | FileCheck %s --check-prefix=CHECK-REGS

david-arm wrote:

Still missing a `REQUIRES: asserts`

https://github.com/llvm/llvm-project/pull/133090


[llvm-branch-commits] [llvm] [LV] Reduce register usage for scaled reductions (PR #133090)

2025-04-10 Thread David Sherwood via llvm-branch-commits


@@ -46,6 +46,11 @@ define i1 @select_exit_cond(ptr %start, ptr %end, i64 %N) {
 ; CHECK-NEXT:[[STEP_ADD_5:%.*]] = add <2 x i64> [[STEP_ADD_4]], splat (i64 2)
 ; CHECK-NEXT:[[STEP_ADD_6:%.*]] = add <2 x i64> [[STEP_ADD_5]], splat (i64 2)
 ; CHECK-NEXT:[[STEP_ADD_7:%.*]] = add <2 x i64> [[STEP_ADD_6]], splat (i64 2)
+; CHECK-NEXT:[[STEP_ADD_8:%.*]] = add <2 x i64> [[STEP_ADD_7]], splat (i64 2)

david-arm wrote:

I'm a bit surprised these are the only CHECK lines that have changed.

https://github.com/llvm/llvm-project/pull/133090


[llvm-branch-commits] [llvm] [LV] Reduce register usage for scaled reductions (PR #133090)

2025-04-07 Thread David Sherwood via llvm-branch-commits


@@ -2033,6 +2033,8 @@ class VPReductionPHIRecipe : public VPHeaderPHIRecipe,
   /// Generate the phi/select nodes.
   void execute(VPTransformState &State) override;
 
+  unsigned getVFScaleFactor() const { return VFScaleFactor; }

david-arm wrote:

Perhaps good to have comments on both new functions added?

https://github.com/llvm/llvm-project/pull/133090


[llvm-branch-commits] [llvm] [LV] Reduce register usage for scaled reductions (PR #133090)

2025-04-10 Thread David Sherwood via llvm-branch-commits


@@ -45,9 +45,9 @@ define void @load_and_compare_only_used_by_assume(ptr %a, ptr noalias %b) {
 ; CHECK-LABEL: LV: Checking a loop in 'load_and_compare_only_used_by_assume'
 ; CHECK: LV(REG): VF = vscale x 4
 ; CHECK-NEXT: LV(REG): Found max usage: 2 item
-; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 2 registers
-; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 1 registers
-; CHECK-NEXT: LV(REG): Found invariant usage: 0 item
+; CHECK-NEXT: LV(REG): RegisterClass: Generic::ScalarRC, 3 registers
+; CHECK-NEXT: LV(REG): RegisterClass: Generic::VectorRC, 2 registers

david-arm wrote:

Do you know why this has changed? It doesn't look like a partial reduction.

https://github.com/llvm/llvm-project/pull/133090


[llvm-branch-commits] [llvm] [LV] Reduce register usage for scaled reductions (PR #133090)

2025-04-10 Thread David Sherwood via llvm-branch-commits


@@ -5039,10 +5039,25 @@ calculateRegisterUsage(VPlan &Plan, ArrayRef<ElementCount> VFs,
 // even in the scalar case.
 RegUsage[ClassID] += 1;
   } else {
+// The output from scaled phis and scaled reductions actually have
+// fewer lanes than the VF.
+ElementCount VF = VFs[J];
+if (auto *ReductionR = dyn_cast<VPReductionPHIRecipe>(R))

david-arm wrote:

I realise it may be less efficient, but perhaps it's better to commonise these 
into the same block? If for some reason we need to update this logic in future 
it's easier to fix it only once, i.e.

```
  if (isa<VPReductionPHIRecipe, VPPartialReductionRecipe>(R)) {
    auto *ReductionR = dyn_cast<VPReductionPHIRecipe>(R);
    auto *PartialReductionR = dyn_cast<VPPartialReductionRecipe>(R);
    unsigned ScaleFactor = ReductionR ? ReductionR->getVFScaleFactor()
                                      : PartialReductionR->getVFScaleFactor();
    VF = VF.divideCoefficientBy(ScaleFactor);
  }
```

If `getVFScaleFactor` becomes available to a common base class then it should 
simplify further. What do you think?
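
To illustrate the common-base-class idea, here is a minimal standalone sketch. It does not use the real VPlan classes: `Recipe`, `ScaledRecipe`, `ReductionPHI`, `PartialReduction`, `WidenRecipe` and `effectiveLanes` are all hypothetical stand-ins, and `dynamic_cast` stands in for LLVM's `dyn_cast`. The only point is that a single query replaces the two per-class branches.

```cpp
#include <cassert>

// Hypothetical stand-in for a generic recipe; not the LLVM definition.
struct Recipe {
  virtual ~Recipe() = default;
};

// Assumed common base that owns the scale factor, so callers only need
// to test for one type.
class ScaledRecipe : public Recipe {
  unsigned VFScaleFactor;

public:
  explicit ScaledRecipe(unsigned Factor) : VFScaleFactor(Factor) {
    assert(Factor != 0 && "scale factor must be non-zero");
  }

  /// Number of input lanes folded into each output lane.
  unsigned getVFScaleFactor() const { return VFScaleFactor; }
};

// Hypothetical recipes that produce fewer lanes than the VF.
struct ReductionPHI : ScaledRecipe {
  using ScaledRecipe::ScaledRecipe;
};
struct PartialReduction : ScaledRecipe {
  using ScaledRecipe::ScaledRecipe;
};

// Hypothetical recipe whose output has exactly VF lanes.
struct WidenRecipe : Recipe {};

// Lane count to use when estimating register usage for R. VF is modelled
// as a plain fixed-width lane count to keep the sketch small.
unsigned effectiveLanes(const Recipe &R, unsigned VF) {
  if (const auto *Scaled = dynamic_cast<const ScaledRecipe *>(&R)) {
    assert(VF % Scaled->getVFScaleFactor() == 0 &&
           "VF must be divisible by the scale factor");
    return VF / Scaled->getVFScaleFactor();
  }
  return VF;
}
```

With this shape, any future change to the scaling logic touches only `effectiveLanes`, whichever recipe type carries the factor.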

https://github.com/llvm/llvm-project/pull/133090


[llvm-branch-commits] [llvm] [LoopVectorizer] Bundle partial reductions inside VPMulAccumulateReductionRecipe (PR #136173)

2025-04-29 Thread David Sherwood via llvm-branch-commits




david-arm wrote:

I tried downloading this patch and applying it to the HEAD of LLVM, but `patch` 
said the diff had already been applied. Does the PR need rebasing?

https://github.com/llvm/llvm-project/pull/136173


[llvm-branch-commits] [llvm] [LoopVectorizer] Bundle partial reductions inside VPMulAccumulateReductionRecipe (PR #136173)

2025-04-29 Thread David Sherwood via llvm-branch-commits




david-arm wrote:

Ah perhaps this is my mistake. You did say it depends upon 
https://github.com/llvm/llvm-project/pull/113903. :)

https://github.com/llvm/llvm-project/pull/136173


[llvm-branch-commits] [llvm] [LV] Bundle sub reductions into VPExpressionRecipe (PR #147255)

2025-07-07 Thread David Sherwood via llvm-branch-commits


@@ -5538,7 +5538,7 @@ LoopVectorizationCostModel::getReductionPatternCost(Instruction *I,
  TTI::CastContextHint::None, CostKind, RedOp);
 
 InstructionCost RedCost = TTI.getMulAccReductionCost(
-IsUnsigned, RdxDesc.getRecurrenceType(), ExtType, CostKind);
+IsUnsigned, RdxDesc.getRecurrenceType(), ExtType, false, CostKind);

david-arm wrote:

nit: `/*Negated=*/false`, and the same for the other calls below.
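
For reference, a small standalone sketch of the argument-comment convention being asked for. The helper `mulAccReductionCost`, its parameter list and its cost values are invented for illustration and are not the `TTI` interface; only the `/*Name=*/` comment style at the call site is the point.

```cpp
#include <cassert>

// Hypothetical cost helper. The name, parameters and cost numbers are
// assumptions made up for this sketch.
unsigned mulAccReductionCost(bool IsUnsigned, unsigned NumLanes,
                             bool Negated) {
  assert(NumLanes != 0 && "expected at least one lane");
  unsigned Cost = NumLanes;
  if (!IsUnsigned)
    Cost += 1; // Pretend signed extension costs one extra unit.
  if (Negated)
    Cost += 1; // Pretend the negate costs one extra unit.
  return Cost;
}

// A bare `true` or `false` at a call site says nothing about its meaning;
// naming the parameter in a comment keeps the call readable.
unsigned costOfPlainAddReduction(unsigned NumLanes) {
  return mulAccReductionCost(/*IsUnsigned=*/true, NumLanes,
                             /*Negated=*/false);
}
```

The comment costs nothing at run time, and it makes an accidental swap of two adjacent booleans much easier to spot in review.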

https://github.com/llvm/llvm-project/pull/147255


[llvm-branch-commits] [llvm] [LV] Bundle sub reductions into VPExpressionRecipe (PR #147255)

2025-07-07 Thread David Sherwood via llvm-branch-commits


@@ -3116,7 +3116,10 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
 
   InstructionCost
   getMulAccReductionCost(bool IsUnsigned, Type *ResTy, VectorType *Ty,
+ bool Negated,
  TTI::TargetCostKind CostKind) const override {
+if (Negated)

david-arm wrote:

Why can't we add a cost for this?

https://github.com/llvm/llvm-project/pull/147255


[llvm-branch-commits] [llvm] [LV] Bundle sub reductions into VPExpressionRecipe (PR #147255)

2025-07-07 Thread David Sherwood via llvm-branch-commits


@@ -2757,6 +2757,12 @@ class VPExpressionRecipe : public VPSingleDefRecipe {
 /// vector operands, performing a reduction.add on the result, and adding
 /// the scalar result to a chain.
 MulAccReduction,
+/// Represent an inloop multiply-accumulate reduction, multiplying the
+/// extended vector operands, negating the multiplication, performing a
+/// reduction.add
+/// on the result, and adding

david-arm wrote:

Formatting of the comment looks a bit odd - can you fix it?

https://github.com/llvm/llvm-project/pull/147255


[llvm-branch-commits] [llvm] [LV] Bundle sub reductions into VPExpressionRecipe (PR #147255)

2025-07-07 Thread David Sherwood via llvm-branch-commits


@@ -1401,8 +1401,8 @@ static void analyzeCostOfVecReduction(const IntrinsicInst &II,
  TTI::CastContextHint::None, CostKind, RedOp);
 
 CostBeforeReduction = ExtCost * 2 + MulCost + Ext2Cost;
-CostAfterReduction =
-TTI.getMulAccReductionCost(IsUnsigned, II.getType(), ExtType, CostKind);
+CostAfterReduction = TTI.getMulAccReductionCost(IsUnsigned, II.getType(),
+ExtType, false, CostKind);

david-arm wrote:

nit: Probably better written as `/*Negated=*/false`

https://github.com/llvm/llvm-project/pull/147255


[llvm-branch-commits] [llvm] [LV] Bundle sub reductions into VPExpressionRecipe (PR #147255)

2025-07-07 Thread David Sherwood via llvm-branch-commits


@@ -2725,6 +2729,31 @@ void VPExpressionRecipe::print(raw_ostream &O, const Twine &Indent,
 O << ")";
 break;
   }
+  case ExpressionTypes::ExtNegatedMulAccReduction: {

david-arm wrote:

Is there a way to commonise this with the ExtMulAccReduction case if the only 
difference is a negate?
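
One possible shape for this, as a standalone sketch rather than the real `VPExpressionRecipe::print`: let both enumerators fall into the same case and branch only on the negate. `ExpressionKind`, `describe` and the printed format below are hypothetical and chosen purely for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical mirror of the two expression kinds under discussion.
enum class ExpressionKind { ExtMulAccReduction, ExtNegatedMulAccReduction };

// Builds a textual description of the expression. Both kinds share one
// case body; the negate is the only point where they differ. The output
// format is invented for this sketch.
std::string describe(ExpressionKind Kind, const std::string &A,
                     const std::string &B) {
  switch (Kind) {
  case ExpressionKind::ExtMulAccReduction:
  case ExpressionKind::ExtNegatedMulAccReduction: {
    bool IsNegated = Kind == ExpressionKind::ExtNegatedMulAccReduction;
    std::string Mul = "mul(ext(" + A + "), ext(" + B + "))";
    if (IsNegated)
      Mul = "sub(0, " + Mul + ")";
    return "reduce.add(" + Mul + ")";
  }
  }
  assert(false && "unhandled expression kind");
  return "";
}
```

Whether this is worthwhile depends on how much the two printing paths really have in common in the patch.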

https://github.com/llvm/llvm-project/pull/147255


[llvm-branch-commits] [llvm] [LV] Bundle sub reductions into VPExpressionRecipe (PR #147255)

2025-07-07 Thread David Sherwood via llvm-branch-commits


@@ -1645,8 +1645,10 @@ class TargetTransformInfo {
   /// extensions. This is the cost of as:
   /// ResTy vecreduce.add(mul (A, B)).
   /// ResTy vecreduce.add(mul(ext(Ty A), ext(Ty B)).
+  /// The multiply can optionally be negated, which signifies that it is a sub
+  /// reduction.
   LLVM_ABI InstructionCost getMulAccReductionCost(
-  bool IsUnsigned, Type *ResTy, VectorType *Ty,
+  bool IsUnsigned, Type *ResTy, VectorType *Ty, bool Negated,

david-arm wrote:

Is it worth keeping the booleans together, i.e. next to `IsUnsigned`?

https://github.com/llvm/llvm-project/pull/147255
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits