[llvm-branch-commits] [lld] 08dd644 - ReleaseNotes: add lld/ELF notes

2021-08-17 Thread Fangrui Song via llvm-branch-commits

Author: Fangrui Song
Date: 2021-08-17T09:44:19-07:00
New Revision: 08dd644d078a77c5eb804442a100bb4a39f9d0d9

URL: 
https://github.com/llvm/llvm-project/commit/08dd644d078a77c5eb804442a100bb4a39f9d0d9
DIFF: 
https://github.com/llvm/llvm-project/commit/08dd644d078a77c5eb804442a100bb4a39f9d0d9.diff

LOG: ReleaseNotes: add lld/ELF notes

For the release/13.x branch.

Differential Revision: https://reviews.llvm.org/D107782

Added: 


Modified: 
lld/docs/ReleaseNotes.rst

Removed: 




diff  --git a/lld/docs/ReleaseNotes.rst b/lld/docs/ReleaseNotes.rst
index 9ae375523518..50af6e7d7939 100644
--- a/lld/docs/ReleaseNotes.rst
+++ b/lld/docs/ReleaseNotes.rst
@@ -24,6 +24,13 @@ Non-comprehensive list of changes in this release
 ELF Improvements
 
 
+* ``-z start-stop-gc`` is now supported and becomes the default.
+  (`D96914 <https://reviews.llvm.org/D96914>`_)
+  (`rG6d2d3bd0 
`_)
+* ``--shuffle-sections=<seed>`` has been changed to ``--shuffle-sections=<section-glob>=<seed>``.
+  If seed is -1, the matched input sections are reversed.
+  (`D98445 <https://reviews.llvm.org/D98445>`_)
+  (`D98679 <https://reviews.llvm.org/D98679>`_)
 * ``-Bsymbolic -Bsymbolic-functions`` has been changed to behave the same as ``-Bsymbolic-functions``. This matches GNU ld.
   (`D102461 <https://reviews.llvm.org/D102461>`_)
 * ``-Bno-symbolic`` has been added.
@@ -32,6 +39,75 @@ ELF Improvements
   (`D103303 <https://reviews.llvm.org/D103303>`_)
 * ``-Bsymbolic-non-weak-functions`` has been added as a ``STB_GLOBAL`` subset of ``-Bsymbolic-functions``.
   (`D102570 <https://reviews.llvm.org/D102570>`_)
+* ``--no-allow-shlib-undefined`` has been improved to catch more cases.
+  (`D101996 <https://reviews.llvm.org/D101996>`_)
+* ``__rela_iplt_start`` is no longer defined for -pie/-shared.
+  This makes GCC/Clang ``-static-pie`` built executables work.
+  (`rG8cb78e99 
`_)
+* IRELATIVE/TLSDESC relocations now support ``-z rel``.
+  (`D100544 <https://reviews.llvm.org/D100544>`_)
+* Section groups with a zero flag are now supported.
+  This is used by ``comdat nodeduplicate`` in LLVM IR.
+  (`D96636 <https://reviews.llvm.org/D96636>`_)
+  (`D106228 <https://reviews.llvm.org/D106228>`_)
+* Defined symbols are now resolved before undefined symbols to stabilize the behavior of archive member extraction.
+  (`D95985 <https://reviews.llvm.org/D95985>`_)
+* ``STB_WEAK`` symbols are now preferred over COMMON symbols as a fix to a ``--fortran-common`` regression.
+  (`D105945 <https://reviews.llvm.org/D105945>`_)
+* Absolute relocations referencing undef weak now produce dynamic relocations for -pie, matching GOT-generating relocations.
+  (`D105164 <https://reviews.llvm.org/D105164>`_)
+* Exported symbols are now communicated to the LTO library so as to make LTO
+  based whole program devirtualization (``-flto=thin -fwhole-program-vtables``)
+  work with shared objects.
+  (`D91583 <https://reviews.llvm.org/D91583>`_)
+* Whole program devirtualization now respects ``local:`` version nodes in a version script.
+  (`D98220 <https://reviews.llvm.org/D98220>`_)
+  (`D98686 <https://reviews.llvm.org/D98686>`_)
+* ``local:`` version nodes in a version script now apply to non-default version symbols.
+  (`D107234 <https://reviews.llvm.org/D107234>`_)
+* If an object file defines both ``foo`` and ``foo@v1``, now only ``foo@v1`` will be in the output.
+  (`D107235 <https://reviews.llvm.org/D107235>`_)
+* Copy relocations on non-default version symbols are now supported.
+  (`D107535 <https://reviews.llvm.org/D107535>`_)
+
+Linker script changes:
+
+* ``.``, ``$``, and double quotes can now be used in symbol names in expressions.
+  (`D98306 <https://reviews.llvm.org/D98306>`_)
+  (`rGe7a7ad13 
`_)
+* Fixed value of ``.`` in the output section description of ``.tbss``.
+  (`D107288 <https://reviews.llvm.org/D107288>`_)
+* ``NOLOAD`` sections can now be placed in a ``PT_LOAD`` program header.
+  (`D103815 <https://reviews.llvm.org/D103815>`_)
+* ``OUTPUT_FORMAT(default, big, little)`` now consults ``-EL`` and ``-EB``.
+  (`D96214 <https://reviews.llvm.org/D96214>`_)
+* The ``OVERWRITE_SECTIONS`` command has been added.
+  (`D103303 <https://reviews.llvm.org/D103303>`_)
+* The section order within an ``INSERT AFTER`` command is now preserved.
+  (`D105158 <https://reviews.llvm.org/D105158>`_)
+
+Architecture specific changes:
+
+* aarch64_be is now supported.
+  (`D96188 <https://reviews.llvm.org/D96188>`_)
+* The AMDGPU port now supports ``--amdhsa-code-object-version=4`` object files;
+  (`D95811 <https://reviews.llvm.org/D95811>`_)
+* The ARM port now accounts for PC biases in range extension thunk creation.
+  (`D97550 <https://reviews.llvm.org/D97550>`_)
+* The

[llvm-branch-commits] [llvm] 2e4c11e - [PowerPC] Disable CTR Loop generate for fma with the PPC double double type.

2021-08-17 Thread Tom Stellard via llvm-branch-commits

Author: Amy Kwan
Date: 2021-08-17T20:22:13-07:00
New Revision: 2e4c11ee320941e9836564a96d89d92f79d38021

URL: 
https://github.com/llvm/llvm-project/commit/2e4c11ee320941e9836564a96d89d92f79d38021
DIFF: 
https://github.com/llvm/llvm-project/commit/2e4c11ee320941e9836564a96d89d92f79d38021.diff

LOG: [PowerPC] Disable CTR Loop generate for fma with the PPC double double type.

It is possible to generate the llvm.fmuladd.ppcf128 intrinsic, and there is no
actual FMA instruction that corresponds to this intrinsic call for ppcf128.
Thus, this intrinsic needs to remain as a call as it cannot be lowered to any
instruction, which also means we need to disable CTR loop generation for fma
involving the ppcf128 type.
This patch accomplishes this behaviour.
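
For illustration, a source loop of roughly the following shape is the kind of code this change concerns. This is only a sketch under assumptions: on a PowerPC target where `long double` is the IBM double-double format (`ppc_fp128`) and floating-point contraction is enabled, a front end may emit `llvm.fmuladd.ppcf128` for the multiply-add in the loop body. The function name `accumulate` is hypothetical and is not taken from the patch or its test.

```cpp
#include <cstddef>

// Hypothetical example, not taken from the patch or its test file.
// Assumption: long double is the PPC double-double type (ppc_fp128) and
// contraction is allowed, so `a[i] * b[i] + acc` may be emitted as a call
// to llvm.fmuladd.ppcf128. On other targets this is ordinary arithmetic.
long double accumulate(const long double *a, const long double *b,
                       std::size_t n) {
  long double acc = 0.0L;
  for (std::size_t i = 0; i < n; ++i) {
    // There is no FMA instruction for ppc_fp128, so the multiply-add stays
    // a call inside the loop; the commit therefore makes mightUseCTR treat
    // this intrinsic like the other ppc_fp128 cases.
    acc = a[i] * b[i] + acc;
  }
  return acc;
}
```

On targets with a different `long double` format the function simply computes a dot product, so it can be compiled and run anywhere; only the code generation differs.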

Differential Revision: https://reviews.llvm.org/D107914

(cherry picked from commit 581a80304c671b6cb2b1b1f87feb9fbe14875f2a)

Added: 
llvm/test/CodeGen/PowerPC/disable-ctr-ppcf128.ll

Modified: 
llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp

Removed: 




diff --git a/llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp b/llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
index d5a7873bd056e..abf5b213bbace 100644
--- a/llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
+++ b/llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp
@@ -485,6 +485,9 @@ bool PPCTTIImpl::mightUseCTR(BasicBlock *BB, TargetLibraryInfo *LibInfo,
   case Intrinsic::experimental_constrained_sin:
   case Intrinsic::experimental_constrained_cos:
 return true;
+  // There is no corresponding FMA instruction for PPC double double.
+  // Thus, we need to disable CTR loop generation for this type.
+  case Intrinsic::fmuladd:
   case Intrinsic::copysign:
 if (CI->getArgOperand(0)->getType()->getScalarType()->
 isPPC_FP128Ty())

diff --git a/llvm/test/CodeGen/PowerPC/disable-ctr-ppcf128.ll b/llvm/test/CodeGen/PowerPC/disable-ctr-ppcf128.ll
new file mode 100644
index 0000000000000..fef538c365130
--- /dev/null
+++ b/llvm/test/CodeGen/PowerPC/disable-ctr-ppcf128.ll
@@ -0,0 +1,113 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mcpu=pwr9 -verify-machineinstrs -ppc-asm-full-reg-names \
+; RUN:   -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck %s --check-prefix=LE
+; RUN: llc -mcpu=pwr9 -verify-machineinstrs -ppc-asm-full-reg-names \
+; RUN:   -mtriple=powerpc64-unknown-linux-gnu < %s | FileCheck %s --check-prefix=P9BE
+; RUN: llc -mcpu=pwr8 -verify-machineinstrs -ppc-asm-full-reg-names \
+; RUN:   -mtriple=powerpc64le-unknown-linux-gnu < %s | FileCheck %s --check-prefix=LE
+; RUN: llc -mcpu=pwr8 -verify-machineinstrs -ppc-asm-full-reg-names \
+; RUN:   -mtriple=powerpc64-unknown-linux-gnu < %s | FileCheck %s --check-prefix=P8BE
+
+declare ppc_fp128 @llvm.fmuladd.ppcf128(ppc_fp128, ppc_fp128, ppc_fp128) #2
+
+define ppc_fp128 @test_ctr0() {
+; LE-LABEL: test_ctr0:
+; LE:   # %bb.0: # %bb
+; LE-NEXT:mflr r0
+; LE-NEXT:.cfi_def_cfa_offset 48
+; LE-NEXT:.cfi_offset lr, 16
+; LE-NEXT:.cfi_offset r30, -16
+; LE-NEXT:std r30, -16(r1) # 8-byte Folded Spill
+; LE-NEXT:std r0, 16(r1)
+; LE-NEXT:stdu r1, -48(r1)
+; LE-NEXT:xxlxor f1, f1, f1
+; LE-NEXT:li r30, 0
+; LE-NEXT:xxlxor f2, f2, f2
+; LE-NEXT:.p2align 5
+; LE-NEXT:  .LBB0_1: # %bb6
+; LE-NEXT:#
+; LE-NEXT:xxlxor f3, f3, f3
+; LE-NEXT:xxlxor f4, f4, f4
+; LE-NEXT:bl __gcc_qadd
+; LE-NEXT:nop
+; LE-NEXT:addi r30, r30, 4
+; LE-NEXT:cmpldi r30, 0
+; LE-NEXT:bne cr0, .LBB0_1
+; LE-NEXT:  # %bb.2: # %bb14
+; LE-NEXT:addi r1, r1, 48
+; LE-NEXT:ld r0, 16(r1)
+; LE-NEXT:ld r30, -16(r1) # 8-byte Folded Reload
+; LE-NEXT:mtlr r0
+; LE-NEXT:blr
+;
+; P9BE-LABEL: test_ctr0:
+; P9BE:   # %bb.0: # %bb
+; P9BE-NEXT:mflr r0
+; P9BE-NEXT:std r0, 16(r1)
+; P9BE-NEXT:stdu r1, -128(r1)
+; P9BE-NEXT:.cfi_def_cfa_offset 128
+; P9BE-NEXT:.cfi_offset lr, 16
+; P9BE-NEXT:.cfi_offset r30, -16
+; P9BE-NEXT:std r30, 112(r1) # 8-byte Folded Spill
+; P9BE-NEXT:xxlxor f1, f1, f1
+; P9BE-NEXT:li r30, 0
+; P9BE-NEXT:xxlxor f2, f2, f2
+; P9BE-NEXT:.p2align 5
+; P9BE-NEXT:  .LBB0_1: # %bb6
+; P9BE-NEXT:#
+; P9BE-NEXT:xxlxor f3, f3, f3
+; P9BE-NEXT:xxlxor f4, f4, f4
+; P9BE-NEXT:bl __gcc_qadd
+; P9BE-NEXT:nop
+; P9BE-NEXT:addi r30, r30, 4
+; P9BE-NEXT:cmpldi r30, 0
+; P9BE-NEXT:bne cr0, .LBB0_1
+; P9BE-NEXT:  # %bb.2: # %bb14
+; P9BE-NEXT:ld r30, 112(r1) # 8-byte Folded Reload
+; P9BE-NEXT:addi r1, r1, 128
+; P9BE-NEXT:ld r0, 16(r1)
+; P9BE-NEXT:mtlr r0
+; P9BE-NEXT:blr
+;
+; P8BE-LABEL: test_ctr0:
+; P8BE:   # %bb.0: # %bb
+; P8BE-NEXT:mflr r0
+; P8BE-NEXT:std r0, 16(r1)
+; P8BE-NEXT:stdu r1, -128(r1)
+; P8BE-NEXT:.cfi_def_cfa_off

[llvm-branch-commits] [flang] 0c25e01 - [Flang] Fix build failure on MacOS

2021-08-17 Thread Tom Stellard via llvm-branch-commits

Author: Kiran Chandramohan
Date: 2021-08-17T20:22:13-07:00
New Revision: 0c25e0174861548ade7cd34671067adbcc0ce5a9

URL: 
https://github.com/llvm/llvm-project/commit/0c25e0174861548ade7cd34671067adbcc0ce5a9
DIFF: 
https://github.com/llvm/llvm-project/commit/0c25e0174861548ade7cd34671067adbcc0ce5a9.diff

LOG: [Flang] Fix build failure on MacOS

std::clock_t can be an unsigned value on some platforms like MacOS and
therefore needs a cast when initializing an std::clock_t value with -1.
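
The difference between the two spellings can be sketched as follows. This is a minimal illustration rather than the Flang runtime code: the helper name `cpuSecondsOrSentinel` and the `-1.0` return value are assumptions made for the example. Brace initialization such as `std::clock_t{-1}` is a narrowing conversion, and therefore ill-formed, when `std::clock_t` is an unsigned type, whereas `static_cast<std::clock_t>(-1)` is valid whether the type is signed or unsigned.

```cpp
#include <ctime>

// Hypothetical helper (not the Flang runtime function): converts a
// std::clock() result to seconds, or returns -1.0 when the clock is
// unavailable.
double cpuSecondsOrSentinel(std::clock_t timestamp) {
  // std::clock() reports failure as (std::clock_t)(-1). Writing the
  // sentinel as std::clock_t{-1} fails to compile where clock_t is
  // unsigned, because list-initialization rejects narrowing conversions.
  if (timestamp != static_cast<std::clock_t>(-1)) {
    return static_cast<double>(timestamp) / CLOCKS_PER_SEC;
  }
  return -1.0;  // assumed sentinel for this sketch only
}
```

A caller would typically pass `std::clock()` directly, for example `cpuSecondsOrSentinel(std::clock())`.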

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D107972

(cherry picked from commit 4573c31f8945071d0069dcad31e17ddfeb7a2d8c)

Added: 


Modified: 
flang/runtime/time-intrinsic.cpp

Removed: 




diff --git a/flang/runtime/time-intrinsic.cpp b/flang/runtime/time-intrinsic.cpp
index 5e7c1bc484d55..d6b1c36bf9e00 100644
--- a/flang/runtime/time-intrinsic.cpp
+++ b/flang/runtime/time-intrinsic.cpp
@@ -36,7 +36,7 @@ using preferred_implementation = int;
 // This is the fallback implementation, which should work everywhere.
 template  double GetCpuTime(fallback_implementation) {
   std::clock_t timestamp{std::clock()};
-  if (timestamp != std::clock_t{-1}) {
+  if (timestamp != static_cast<std::clock_t>(-1)) {
     return static_cast<double>(timestamp) / CLOCKS_PER_SEC;
   }
 





[llvm-branch-commits] [llvm] 5a87328 - more ds/dq preparation

2021-08-17 Thread Chen Zheng via llvm-branch-commits

Author: Chen Zheng
Date: 2021-07-15T07:54:49Z
New Revision: 5a8732852b4d7225acaa347f705798fe7d61e92c

URL: 
https://github.com/llvm/llvm-project/commit/5a8732852b4d7225acaa347f705798fe7d61e92c
DIFF: 
https://github.com/llvm/llvm-project/commit/5a8732852b4d7225acaa347f705798fe7d61e92c.diff

LOG: more ds/dq preparation
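
A comment in `prepareBaseForDispFormChain`, touched by this diff, describes how a base is chosen for DS-form accesses: displacements are grouped by their remainder modulo the form's alignment, and the remainder shared by the most accesses determines the base. The following is a minimal sketch of that idea under stated assumptions; the helper names `pickMostCommonRemainder` and `countDSForm` are hypothetical and do not appear in the pass, and the real code works on SCEV expressions rather than plain integers.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical helper: maps a displacement to its remainder in [0, form).
// Assumes form > 0.
static unsigned remainderOf(std::int64_t off, unsigned form) {
  const std::int64_t f = static_cast<std::int64_t>(form);
  // Normalise so that negative displacements also land in [0, form).
  return static_cast<unsigned>(((off % f) + f) % f);
}

// Hypothetical helper: returns the remainder that occurs most often among
// the displacements. Ties go to the smallest remainder.
unsigned pickMostCommonRemainder(const std::vector<std::int64_t> &offsets,
                                 unsigned form) {
  std::map<unsigned, unsigned> count;
  for (std::int64_t off : offsets)
    ++count[remainderOf(off, form)];

  unsigned best = 0, bestCount = 0;
  for (unsigned r = 0; r < form; ++r) {
    auto it = count.find(r);
    if (it != count.end() && it->second > bestCount) {
      best = r;
      bestCount = it->second;
    }
  }
  return best;
}

// Counts the accesses whose displacement becomes a multiple of `form` once
// rebased onto an element with remainder `baseRem`.
unsigned countDSForm(const std::vector<std::int64_t> &offsets, unsigned form,
                     unsigned baseRem) {
  unsigned n = 0;
  for (std::int64_t off : offsets)
    if (remainderOf(off, form) == baseRem)
      ++n;
  return n;
}
```

With ten displacements of remainder 1, nine of remainder 2, and one of remainder 3 (the case in the comment), a single base with remainder 1 yields 10 DS-form accesses, while using a second base for remainder 2 would raise that to 19.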

Added: 


Modified: 
llvm/lib/Target/PowerPC/PPCLoopInstrFormPrep.cpp
llvm/test/CodeGen/PowerPC/loop-instr-prep-non-const-increasement.ll
llvm/test/CodeGen/PowerPC/lsr-profitable-chain.ll

Removed: 




diff --git a/llvm/lib/Target/PowerPC/PPCLoopInstrFormPrep.cpp b/llvm/lib/Target/PowerPC/PPCLoopInstrFormPrep.cpp
index 1d2b1ed3f6269..5f08268277a0e 100644
--- a/llvm/lib/Target/PowerPC/PPCLoopInstrFormPrep.cpp
+++ b/llvm/lib/Target/PowerPC/PPCLoopInstrFormPrep.cpp
@@ -169,7 +169,7 @@ namespace {
 
   private:
 PPCTargetMachine *TM = nullptr;
-const PPCSubtarget *ST; 
+const PPCSubtarget *ST;
 DominatorTree *DT;
 LoopInfo *LI;
 ScalarEvolution *SE;
@@ -184,10 +184,13 @@ namespace {
 bool runOnLoop(Loop *L);
 
 /// Check if required PHI node is already exist in Loop \p L.
-bool alreadyPrepared(Loop *L, Instruction* MemI,
+bool alreadyPrepared(Loop *L, Instruction *MemI,
  const SCEV *BasePtrStartSCEV,
- const SCEVConstant *BasePtrIncSCEV,
- InstrForm Form);
+ const SCEV *BasePtrIncSCEV, InstrForm Form);
+
+/// Get the value which defines the increment SCEV \p BasePtrIncSCEV.
+Value *getPreparedIncNode(Loop *L, Instruction *MemI,
+  const SCEV *BasePtrIncSCEV);
 
 /// Collect condition matched(\p isValidCandidate() returns true)
 /// candidates in Loop \p L.
@@ -266,7 +269,7 @@ static std::string getInstrName(const Value *I, StringRef Suffix) {
   if (I->hasName())
 return (I->getName() + Suffix).str();
   else
-return ""; 
+return "";
 }
 
 static Value *GetPointerOperand(Value *MemI) {
@@ -404,13 +407,13 @@ bool PPCLoopInstrFormPrep::prepareBaseForDispFormChain(Bucket &BucketChain,
   // contains following load/stores with different remainders:
   // 1: 10 load/store whose remainder is 1;
   // 2: 9 load/store whose remainder is 2;
-  // 3: 1 for remainder 3 and 0 for remainder 0; 
+  // 3: 1 for remainder 3 and 0 for remainder 0;
   // Now we will choose the first load/store whose remainder is 1 as base and
   // adjust all other load/stores according to new base, so we will get 10 DS
   // form and 10 X form.
   // But we should be more clever, for this case we could use two bases, one for
-  // remainder 1 and the other for remainder 2, thus we could get 19 DS form and 1
-  // X form.
+  // remainder 1 and the other for remainder 2, thus we could get 19 DS form and
+  // 1 X form.
   unsigned MaxCountRemainder = 0;
   for (unsigned j = 0; j < (unsigned)Form; j++)
 if ((RemainderOffsetInfo.find(j) != RemainderOffsetInfo.end()) &&
@@ -515,28 +518,48 @@ bool PPCLoopInstrFormPrep::rewriteLoadStores(Loop *L, Bucket &BucketChain,
   if (!SE->isLoopInvariant(BasePtrSCEV->getStart(), L))
 return MadeChange;
 
-  const SCEVConstant *BasePtrIncSCEV =
-dyn_cast<SCEVConstant>(BasePtrSCEV->getStepRecurrence(*SE));
-  if (!BasePtrIncSCEV)
+  bool IsConstantInc = false;
+  const SCEV *BasePtrIncSCEV = BasePtrSCEV->getStepRecurrence(*SE);
+  Value *IncNode = getPreparedIncNode(L, MemI, BasePtrIncSCEV);
+
+  const SCEVConstant *BasePtrIncConstantSCEV =
+  dyn_cast<SCEVConstant>(BasePtrIncSCEV);
+  if (BasePtrIncConstantSCEV)
+IsConstantInc = true;
+
+  // No valid representation for the increment.
+  if (!IncNode) {
+LLVM_DEBUG(dbgs() << "Loop Increasement can not be represented!\n");
 return MadeChange;
+  }
+
+  // Now we only handle update form for constant increment.
+  // FIXME: add support for non-constant increment UpdateForm.
+  if (!IsConstantInc && Form == UpdateForm) {
+LLVM_DEBUG(dbgs() << "not a constant incresement for update form!\n");
+return MadeChange;
+  }
 
   // For some DS form load/store instructions, it can also be an update form,
   // if the stride is a multipler of 4. Use update form if prefer it.
-  bool CanPreInc = (Form == UpdateForm ||
-((Form == DSForm) && !BasePtrIncSCEV->getAPInt().urem(4) &&
- PreferUpdateForm));
+  bool CanPreInc =
+  (Form == UpdateForm ||
+   ((Form == DSForm) && IsConstantInc &&
+!BasePtrIncConstantSCEV->getAPInt().urem(4) && PreferUpdateForm));
   const SCEV *BasePtrStartSCEV = nullptr;
   if (CanPreInc)
 BasePtrStartSCEV =
-SE->getMinusSCEV(BasePtrSCEV->getStart(), BasePtrIncSCEV);
+SE->getMinusSCEV(BasePtrSCEV->getStart(), BasePtrIncConstantSCEV);
   else
 BasePtrStartSCEV = BasePtrSCEV->getStart();
 
   if (!isSafeToExpand(BasePtrStartSCEV, *SE))
 return MadeCha

[llvm-branch-commits] [llvm] ee7e6c4 - common chains

2021-08-17 Thread Chen Zheng via llvm-branch-commits

Author: Chen Zheng
Date: 2021-08-18T03:20:39Z
New Revision: ee7e6c4e05af743c1ba2db57abd33fb828d49025

URL: 
https://github.com/llvm/llvm-project/commit/ee7e6c4e05af743c1ba2db57abd33fb828d49025
DIFF: 
https://github.com/llvm/llvm-project/commit/ee7e6c4e05af743c1ba2db57abd33fb828d49025.diff

LOG: common chains

Added: 


Modified: 
llvm/lib/Target/PowerPC/PPCLoopInstrFormPrep.cpp

Removed: 




diff --git a/llvm/lib/Target/PowerPC/PPCLoopInstrFormPrep.cpp b/llvm/lib/Target/PowerPC/PPCLoopInstrFormPrep.cpp
index 010f49c8d3ebc..c3d3f1504fd4d 100644
--- a/llvm/lib/Target/PowerPC/PPCLoopInstrFormPrep.cpp
+++ b/llvm/lib/Target/PowerPC/PPCLoopInstrFormPrep.cpp
@@ -140,6 +140,38 @@ namespace {
 SmallVector Elements;
   };
 
+  struct ChainBucketElement {
+ChainBucketElement(const SCEV *O, Instruction *I) : Offset(O), Instr(I) {}
+ChainBucketElement(Instruction *I) : Offset(nullptr), Instr(I) {}
+
+const SCEV *Offset;
+Instruction *Instr;
+  };
+
+  struct ChainBucket {
+ChainBucket(const SCEV *B, Instruction *I) : BaseSCEV(B),
+Elements(1, ChainBucketElement(I)) { ChainSize = 0; }
+
+const SCEV *BaseSCEV;
+   // Value *Ptr;
+SmallVector Elements;
+unsigned ChainSize;
+SmallVector ChainBases;
+//SmallVector RewriteBuckets;
+void dump() {
+  LLVM_DEBUG(dbgs() << "Chain base scev is "; BaseSCEV->dump());
+  LLVM_DEBUG(dbgs() << "Chain element size is "<< Elements.size() << "\n");
+  for (auto E : Elements) {
+if (!E.Offset)
+LLVM_DEBUG(dbgs() << "base Element Instruction is "; E.Instr->dump());
+else {
+  LLVM_DEBUG(dbgs() << "Element offset is "; E.Offset->dump());
+  LLVM_DEBUG(dbgs() << "Element instruction is "; E.Instr->dump());
+}
+  }
+}
+  };
+
   // "UpdateForm" is not a real PPC instruction form, it stands for dform
   // load/store with update like ldu/stdu, or Prefetch intrinsic.
   // For DS form instructions, their displacements must be multiple of 4.
@@ -192,6 +224,21 @@ namespace {
 Value *getPreparedIncNode(Loop *L, Instruction *MemI,
   const SCEV *BasePtrIncSCEV);
 
+/// Collect chain load/store candidates in Loop \p L.
+SmallVector  collectCandidatesForChain(Loop *L);
+
+/// Add a candidate to candidates \p Buckets for chain.
+void addOneCandidateForChain(Instruction *MemI, const SCEV *LSCEV, SmallVector &Buckets);
+
+/// Common chains to reuse offsets for a loop to reduce register pressure.
+bool chainCommoning(Loop *L, SmallVector &ChainBuckets);
+
+bool prepareBasesForChains(ChainBucket &BucketChain);
+
+bool rewriteLoadStoresForChains(Loop *L, ChainBucket &Bucket,
+   SmallSet &BBChanged,
+DenseMap &ExpandedOffsets);
+
 /// Collect condition matched(\p isValidCandidate() returns true)
 /// candidates in Loop \p L.
 SmallVector collectCandidates(
@@ -272,7 +319,7 @@ static std::string getInstrName(const Value *I, StringRef Suffix) {
 return "";
 }
 
-static Value *GetPointerOperand(Value *MemI) {
+static Value *getPtrOperand(Value *MemI) {
   if (LoadInst *LMemI = dyn_cast<LoadInst>(MemI)) {
 return LMemI->getPointerOperand();
   } else if (StoreInst *SMemI = dyn_cast<StoreInst>(MemI)) {
@@ -309,10 +356,448 @@ bool PPCLoopInstrFormPrep::runOnFunction(Function &F) {
   return MadeChange;
 }
 
+// check if the SCEV is only with one ptr operand in its start, so that we can
+// use that start as a chain separator.
+static bool isValidChainCandidate(const SCEV *LSCEV)
+{
+  const SCEVAddRecExpr *ARSCEV = cast<SCEVAddRecExpr>(LSCEV);
+  if (!ARSCEV)
+return false;
+
+  if (!ARSCEV->isAffine())
+return false;
+
+  const SCEV *Start = ARSCEV->getStart();
+  LLVM_DEBUG(dbgs() << "Start SCEV is "; Start->dump());
+  LLVM_DEBUG(dbgs() << "Start SCEV type is "; Start->getType()->dump());
+
+  LLVM_DEBUG(dbgs() << "start is unknown is " << isa<SCEVUnknown>(Start) << "\n");
+
+  // A single pointer.
+  if (isa<SCEVUnknown>(Start) && Start->getType()->isPointerTy())
+return true;
+
+  const SCEVAddExpr *ASCEV = dyn_cast<SCEVAddExpr>(Start);
+
+  // Now we only handle SCEVAddExpr.
+  if (!ASCEV)
+return false;
+
+  bool SawPointer = false;
+  LLVM_DEBUG(dbgs() << "operand number is " << ASCEV->getNumOperands() << "\n");
+  int i = 0;
+  for (const SCEV *Op : ASCEV->operands()) {
+i++;
+LLVM_DEBUG(dbgs() << "operand " << i << " is "; Op->dump());
+LLVM_DEBUG(dbgs() << "operand " << i << " type is "; Op->getType()->dump());
+if (Op->getType()->isPointerTy()) {
+  if (SawPointer)
+return false;
+  SawPointer = true;
+}
+else if (!Op->getType()->isIntegerTy())
+  return false;
+  }
+
+  return SawPointer;
+}
+
+// Make sure the diff between the base and new candidate is:
+// 1: an integer type.
+// 2: does not contain any pointer type.
+st

[llvm-branch-commits] [llvm] cb317d6 - update form prepare

2021-08-17 Thread Chen Zheng via llvm-branch-commits

Author: Chen Zheng
Date: 2021-07-15T08:02:10Z
New Revision: cb317d60cca17f5ab60bd841b0d25a145cedfa70

URL: 
https://github.com/llvm/llvm-project/commit/cb317d60cca17f5ab60bd841b0d25a145cedfa70
DIFF: 
https://github.com/llvm/llvm-project/commit/cb317d60cca17f5ab60bd841b0d25a145cedfa70.diff

LOG: update form prepare

Added: 


Modified: 
llvm/lib/Target/PowerPC/PPCLoopInstrFormPrep.cpp
llvm/test/CodeGen/PowerPC/loop-instr-prep-non-const-increasement.ll
llvm/test/CodeGen/PowerPC/lsr-profitable-chain.ll

Removed: 




diff --git a/llvm/lib/Target/PowerPC/PPCLoopInstrFormPrep.cpp b/llvm/lib/Target/PowerPC/PPCLoopInstrFormPrep.cpp
index 5f08268277a0e..010f49c8d3ebc 100644
--- a/llvm/lib/Target/PowerPC/PPCLoopInstrFormPrep.cpp
+++ b/llvm/lib/Target/PowerPC/PPCLoopInstrFormPrep.cpp
@@ -533,13 +533,6 @@ bool PPCLoopInstrFormPrep::rewriteLoadStores(Loop *L, Bucket &BucketChain,
 return MadeChange;
   }
 
-  // Now we only handle update form for constant increment.
-  // FIXME: add support for non-constant increment UpdateForm.
-  if (!IsConstantInc && Form == UpdateForm) {
-LLVM_DEBUG(dbgs() << "not a constant incresement for update form!\n");
-return MadeChange;
-  }
-
   // For some DS form load/store instructions, it can also be an update form,
   // if the stride is a multipler of 4. Use update form if prefer it.
   bool CanPreInc =
@@ -547,10 +540,13 @@ bool PPCLoopInstrFormPrep::rewriteLoadStores(Loop *L, Bucket &BucketChain,
((Form == DSForm) && IsConstantInc &&
 !BasePtrIncConstantSCEV->getAPInt().urem(4) && PreferUpdateForm));
   const SCEV *BasePtrStartSCEV = nullptr;
-  if (CanPreInc)
-BasePtrStartSCEV =
-SE->getMinusSCEV(BasePtrSCEV->getStart(), BasePtrIncConstantSCEV);
-  else
+  if (CanPreInc) {
+assert(SE->isLoopInvariant(BasePtrIncSCEV, L) &&
+   "Increment is not loop invariant!\n");
+BasePtrStartSCEV = SE->getMinusSCEV(BasePtrSCEV->getStart(),
+IsConstantInc ? BasePtrIncConstantSCEV
+  : BasePtrIncSCEV);
+  } else
 BasePtrStartSCEV = BasePtrSCEV->getStart();
 
   if (!isSafeToExpand(BasePtrStartSCEV, *SE))
@@ -588,12 +584,10 @@ bool PPCLoopInstrFormPrep::rewriteLoadStores(Loop *L, Bucket &BucketChain,
   Instruction *PtrInc = nullptr;
   Instruction *NewBasePtr = nullptr;
   if (CanPreInc) {
-assert(BasePtrIncConstantSCEV &&
-   "update form now only supports constant increment.");
 Instruction *InsPoint = &*Header->getFirstInsertionPt();
-PtrInc = GetElementPtrInst::Create(
-I8Ty, NewPHI, BasePtrIncConstantSCEV->getValue(),
-getInstrName(MemI, GEPNodeIncNameSuffix), InsPoint);
+PtrInc = GetElementPtrInst::Create(I8Ty, NewPHI, IncNode,
+   getInstrName(MemI, GEPNodeIncNameSuffix),
+   InsPoint);
 cast<GetElementPtrInst>(PtrInc)->setIsInBounds(IsPtrInBounds(BasePtr));
 for (auto PI : predecessors(Header)) {
   if (PI == LoopPredecessor)

diff --git a/llvm/test/CodeGen/PowerPC/loop-instr-prep-non-const-increasement.ll b/llvm/test/CodeGen/PowerPC/loop-instr-prep-non-const-increasement.ll
index 6132074004305..6628ac89f79ad 100644
--- a/llvm/test/CodeGen/PowerPC/loop-instr-prep-non-const-increasement.ll
+++ b/llvm/test/CodeGen/PowerPC/loop-instr-prep-non-const-increasement.ll
@@ -85,18 +85,17 @@ define zeroext i8 @foo1(i8* %p, i32 signext %n, i32 signext %count) {
 ; CHECK-NEXT:cmpwi r4, 1
 ; CHECK-NEXT:blt cr0, .LBB1_4
 ; CHECK-NEXT:  # %bb.1: # %for.body.preheader
+; CHECK-NEXT:extsw r5, r5
+; CHECK-NEXT:sub r3, r3, r5
 ; CHECK-NEXT:addi r6, r3, 1000
 ; CHECK-NEXT:clrldi r3, r4, 32
-; CHECK-NEXT:extsw r5, r5
-; CHECK-NEXT:li r4, 0
 ; CHECK-NEXT:mtctr r3
 ; CHECK-NEXT:li r3, 0
 ; CHECK-NEXT:.p2align 4
 ; CHECK-NEXT:  .LBB1_2: # %for.body
 ; CHECK-NEXT:#
-; CHECK-NEXT:lbzx r7, r6, r4
-; CHECK-NEXT:add r4, r4, r5
-; CHECK-NEXT:add r3, r7, r3
+; CHECK-NEXT:lbzux r4, r6, r5
+; CHECK-NEXT:add r3, r4, r3
 ; CHECK-NEXT:bdnz .LBB1_2
 ; CHECK-NEXT:  # %bb.3: # %for.cond.cleanup
 ; CHECK-NEXT:clrldi r3, r3, 56

diff --git a/llvm/test/CodeGen/PowerPC/lsr-profitable-chain.ll b/llvm/test/CodeGen/PowerPC/lsr-profitable-chain.ll
index 346353bc12d0a..f2d8157faba32 100644
--- a/llvm/test/CodeGen/PowerPC/lsr-profitable-chain.ll
+++ b/llvm/test/CodeGen/PowerPC/lsr-profitable-chain.ll
@@ -50,26 +50,26 @@ define void @foo(double* readonly %0, double* %1, i64 %2, i64 %3, i64 %4, i64 %5
 ; CHECK-NEXT:cmpd 6, 24
 ; CHECK-NEXT:bge 0, .LBB0_2
 ; CHECK-NEXT:  # %bb.4:
-; CHECK-NEXT:maddld 19, 0, 27, 30
+; CHECK-NEXT:maddld 21, 0, 27, 30
 ; CHECK-NEXT:maddld 20, 0, 27, 12
-; CHECK-NEXT:add 22, 6, 28
-; CHECK-NEXT:add 21, 6, 8
+; CHECK-NEXT:add 23, 6, 28
+; CHE