Also, the patch slows down 464.h264ref by 7% at -O3 -flto:
https://lists.linaro.org/pipermail/linaro-toolchain/2021-September/007882.html
Regards,

--
Maxim Kuvyrkov
https://www.linaro.org

> On 27 Sep 2021, at 15:59, Maxim Kuvyrkov <maxim.kuvyr...@linaro.org> wrote:
>
> Hi Nikita,
>
> FYI, your patch seems to increase the code size of 462.libquantum by 3%.
> Would you please check whether this could be easily fixed?
>
> Regards,
>
> --
> Maxim Kuvyrkov
> https://www.linaro.org
>
>> On 27 Sep 2021, at 01:34, ci_not...@linaro.org wrote:
>>
>> After llvm commit 1e3c6fc7cb9d2ee6a5328881f95d6643afeadbff
>> Author: Nikita Popov <nikita....@gmail.com>
>>
>>     [JumpThreading] Ignore free instructions
>>
>> the following benchmarks grew in size by more than 1%:
>> - 462.libquantum grew in size by 3% from 14035 to 14398 bytes
>>
>> The reproducer instructions below can be used to re-build both the
>> "first_bad" and "last_good" cross-toolchains used in this bisection.
>> Naturally, the scripts will fail when triggering benchmarking jobs if you
>> don't have access to Linaro TCWG CI.
>>
>> For your convenience, we have uploaded tarballs with pre-processed source
>> and assembly files at:
>> - First_bad save-temps:
>>   https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-arm-spec2k6-Os_LTO/4/artifact/artifacts/build-1e3c6fc7cb9d2ee6a5328881f95d6643afeadbff/save-temps/
>> - Last_good save-temps:
>>   https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-arm-spec2k6-Os_LTO/4/artifact/artifacts/build-1a6e1ee42a6af255d45e3fd2fe87021dd31f79bb/save-temps/
>> - Baseline save-temps:
>>   https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-arm-spec2k6-Os_LTO/4/artifact/artifacts/build-baseline/save-temps/
>>
>> Configuration:
>> - Benchmark: SPEC CPU2006
>> - Toolchain: Clang + Glibc + LLVM Linker
>> - Version: all components were built from their tip of trunk
>> - Target: arm-linux-gnueabihf
>> - Compiler flags: -Os -flto -mthumb
>> - Hardware: APM Mustang 8x X-Gene1
>>
>> This benchmarking CI is a work in progress, and we welcome feedback and
>> suggestions at linaro-toolchain@lists.linaro.org. We plan to add support
>> for SPEC CPU2017 benchmarks and to provide "perf report/annotate" data
>> behind these reports.
>>
>> THIS IS THE END OF INTERESTING STUFF. BELOW ARE LINKS TO BUILDS,
>> REPRODUCTION INSTRUCTIONS, AND THE RAW COMMIT.
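
To make the size regression easier to reason about: the patch (quoted in full at the bottom of this
message) replaces JumpThreading's hand-written list of free instructions with a TargetTransformInfo
query, so free intrinsics such as llvm.assume and llvm.experimental.noalias.scope.decl no longer
count against the block-duplication threshold, more blocks become cheap enough to thread, and the
extra duplication is presumably where the -Os code-size growth comes from. The snippet below is a
minimal standalone sketch of that query, not the pass code itself: it uses a default-constructed,
generic TargetTransformInfo rather than the target-specific one the pass obtains from
TargetIRAnalysis, and the toy IR and build command are only illustrative assumptions.

<cut>
// free_instr_sketch.cpp -- illustrative only; assumes LLVM 13-era development
// headers are installed. Assumed build command:
//   clang++ free_instr_sketch.cpp $(llvm-config --cxxflags --ldflags \
//       --libs core analysis asmparser support) -o free_instr_sketch
#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/AsmParser/Parser.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/SourceMgr.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

// Toy input: one real store surrounded by instructions that are typically free.
static const char *IR = R"(
define i32 @f(i32* %p) {
  call void @llvm.assume(i1 true)
  %q = bitcast i32* %p to i8*
  store i32 1, i32* %p
  ret i32 0
}
declare void @llvm.assume(i1)
)";

int main() {
  LLVMContext Ctx;
  SMDiagnostic Err;
  std::unique_ptr<Module> M = parseAssemblyString(IR, Err, Ctx);
  if (!M) {
    Err.print("free_instr_sketch", errs());
    return 1;
  }

  // Generic, target-independent cost model. JumpThreading itself uses the
  // target-specific TTI from TargetIRAnalysis, so real answers may differ.
  TargetTransformInfo TTI(M->getDataLayout());

  for (Instruction &I : M->getFunction("f")->getEntryBlock()) {
    // The same query the patch adds to getJumpThreadDuplicationCost().
    bool IsFree =
        TTI.getUserCost(&I, TargetTransformInfo::TCK_SizeAndLatency) ==
        TargetTransformInfo::TCC_Free;
    errs() << (IsFree ? "free    : " : "counted : ") << I << "\n";
  }
  return 0;
}
</cut>

Running it just prints a free/counted classification for each instruction of the toy function; the
real threading decision is made by getJumpThreadDuplicationCost in the diff quoted below.
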
>>
>> This commit has regressed these CI configurations:
>> - tcwg_bmk_llvm_apm/llvm-master-arm-spec2k6-Os_LTO
>>
>> First_bad build:
>>   https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-arm-spec2k6-Os_LTO/4/artifact/artifacts/build-1e3c6fc7cb9d2ee6a5328881f95d6643afeadbff/
>> Last_good build:
>>   https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-arm-spec2k6-Os_LTO/4/artifact/artifacts/build-1a6e1ee42a6af255d45e3fd2fe87021dd31f79bb/
>> Baseline build:
>>   https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-arm-spec2k6-Os_LTO/4/artifact/artifacts/build-baseline/
>> Even more details:
>>   https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-arm-spec2k6-Os_LTO/4/artifact/artifacts/
>>
>> Reproduce builds:
>> <cut>
>> mkdir investigate-llvm-1e3c6fc7cb9d2ee6a5328881f95d6643afeadbff
>> cd investigate-llvm-1e3c6fc7cb9d2ee6a5328881f95d6643afeadbff
>>
>> # Fetch scripts
>> git clone https://git.linaro.org/toolchain/jenkins-scripts
>>
>> # Fetch manifests and test.sh script
>> mkdir -p artifacts/manifests
>> curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-arm-spec2k6-Os_LTO/4/artifact/artifacts/manifests/build-baseline.sh --fail
>> curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-arm-spec2k6-Os_LTO/4/artifact/artifacts/manifests/build-parameters.sh --fail
>> curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_apm-llvm-master-arm-spec2k6-Os_LTO/4/artifact/artifacts/test.sh --fail
>> chmod +x artifacts/test.sh
>>
>> # Reproduce the baseline build (build all pre-requisites)
>> ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh
>>
>> # Save baseline build state (which is then restored in artifacts/test.sh)
>> mkdir -p ./bisect
>> rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ --exclude /llvm/ ./ ./bisect/baseline/
>>
>> cd llvm
>>
>> # Reproduce first_bad build
>> git checkout --detach 1e3c6fc7cb9d2ee6a5328881f95d6643afeadbff
>> ../artifacts/test.sh
>>
>> # Reproduce last_good build
>> git checkout --detach 1a6e1ee42a6af255d45e3fd2fe87021dd31f79bb
>> ../artifacts/test.sh
>>
>> cd ..
>> </cut>
>>
>> Full commit (up to 1000 lines):
>> <cut>
>> commit 1e3c6fc7cb9d2ee6a5328881f95d6643afeadbff
>> Author: Nikita Popov <nikita....@gmail.com>
>> Date:   Wed Sep 22 21:34:24 2021 +0200
>>
>>     [JumpThreading] Ignore free instructions
>>
>>     This is basically D108837 but for jump threading. Free instructions
>>     should be ignored for the threading decision. JumpThreading already
>>     skips some free instructions (like pointer bitcasts), but does not
>>     skip various free intrinsics -- in fact, it currently gives them a
>>     fairly large cost of 2.
>>
>>     Differential Revision: https://reviews.llvm.org/D110290
>> ---
>>  .../include/llvm/Transforms/Scalar/JumpThreading.h |  8 +--
>>  llvm/lib/Transforms/Scalar/JumpThreading.cpp       | 61 ++++++++++------------
>>  .../Transforms/JumpThreading/free_instructions.ll  | 24 +++++----
>>  .../inlining-alignment-assumptions.ll              | 12 ++---
>>  4 files changed, 52 insertions(+), 53 deletions(-)
>>
>> diff --git a/llvm/include/llvm/Transforms/Scalar/JumpThreading.h b/llvm/include/llvm/Transforms/Scalar/JumpThreading.h
>> index 816ea1071e52..0ac7d7c62b7a 100644
>> --- a/llvm/include/llvm/Transforms/Scalar/JumpThreading.h
>> +++ b/llvm/include/llvm/Transforms/Scalar/JumpThreading.h
>> @@ -44,6 +44,7 @@ class PHINode;
>>  class SelectInst;
>>  class SwitchInst;
>>  class TargetLibraryInfo;
>> +class TargetTransformInfo;
>>  class Value;
>>
>>  /// A private "module" namespace for types and utilities used by
>> @@ -78,6 +79,7 @@ enum ConstantPreference { WantInteger, WantBlockAddress };
>>  /// revectored to the false side of the second if.
>>  class JumpThreadingPass : public PassInfoMixin<JumpThreadingPass> {
>>    TargetLibraryInfo *TLI;
>> +  TargetTransformInfo *TTI;
>>    LazyValueInfo *LVI;
>>    AAResults *AA;
>>    DomTreeUpdater *DTU;
>> @@ -99,9 +101,9 @@ public:
>>    JumpThreadingPass(bool InsertFreezeWhenUnfoldingSelect = false, int T = -1);
>>
>>    // Glue for old PM.
>> -  bool runImpl(Function &F, TargetLibraryInfo *TLI, LazyValueInfo *LVI,
>> -               AAResults *AA, DomTreeUpdater *DTU, bool HasProfileData,
>> -               std::unique_ptr<BlockFrequencyInfo> BFI,
>> +  bool runImpl(Function &F, TargetLibraryInfo *TLI, TargetTransformInfo *TTI,
>> +               LazyValueInfo *LVI, AAResults *AA, DomTreeUpdater *DTU,
>> +               bool HasProfileData, std::unique_ptr<BlockFrequencyInfo> BFI,
>>                 std::unique_ptr<BranchProbabilityInfo> BPI);
>>
>>    PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
>> diff --git a/llvm/lib/Transforms/Scalar/JumpThreading.cpp b/llvm/lib/Transforms/Scalar/JumpThreading.cpp
>> index 688902ecb9ff..fe9a7211967c 100644
>> --- a/llvm/lib/Transforms/Scalar/JumpThreading.cpp
>> +++ b/llvm/lib/Transforms/Scalar/JumpThreading.cpp
>> @@ -331,7 +331,7 @@ bool JumpThreading::runOnFunction(Function &F) {
>>      BFI.reset(new BlockFrequencyInfo(F, *BPI, LI));
>>    }
>>
>> -  bool Changed = Impl.runImpl(F, TLI, LVI, AA, &DTU, F.hasProfileData(),
>> +  bool Changed = Impl.runImpl(F, TLI, TTI, LVI, AA, &DTU, F.hasProfileData(),
>>                                std::move(BFI), std::move(BPI));
>>    if (PrintLVIAfterJumpThreading) {
>>      dbgs() << "LVI for function '" << F.getName() << "':\n";
>> @@ -360,7 +360,7 @@ PreservedAnalyses JumpThreadingPass::run(Function &F,
>>      BFI.reset(new BlockFrequencyInfo(F, *BPI, LI));
>>    }
>>
>> -  bool Changed = runImpl(F, &TLI, &LVI, &AA, &DTU, F.hasProfileData(),
>> +  bool Changed = runImpl(F, &TLI, &TTI, &LVI, &AA, &DTU, F.hasProfileData(),
>>                           std::move(BFI), std::move(BPI));
>>
>>    if (PrintLVIAfterJumpThreading) {
>> @@ -377,12 +377,14 @@ PreservedAnalyses JumpThreadingPass::run(Function &F,
>>  }
>>
>>  bool JumpThreadingPass::runImpl(Function &F, TargetLibraryInfo *TLI_,
>> -                                LazyValueInfo *LVI_, AliasAnalysis *AA_,
>> -                                DomTreeUpdater *DTU_, bool HasProfileData_,
>> +                                TargetTransformInfo *TTI_, LazyValueInfo *LVI_,
>> +                                AliasAnalysis *AA_, DomTreeUpdater *DTU_,
>> +                                bool HasProfileData_,
>>                                  std::unique_ptr<BlockFrequencyInfo> BFI_,
>>                                  std::unique_ptr<BranchProbabilityInfo> BPI_) {
>>    LLVM_DEBUG(dbgs() << "Jump threading on function '" << F.getName() << "'\n");
>>    TLI = TLI_;
>> +  TTI = TTI_;
>>    LVI = LVI_;
>>    AA = AA_;
>>    DTU = DTU_;
>> @@ -514,7 +516,8 @@ static void replaceFoldableUses(Instruction *Cond, Value *ToVal) {
>>  /// Return the cost of duplicating a piece of this block from first non-phi
>>  /// and before StopAt instruction to thread across it. Stop scanning the block
>>  /// when exceeding the threshold. If duplication is impossible, returns ~0U.
>> -static unsigned getJumpThreadDuplicationCost(BasicBlock *BB,
>> +static unsigned getJumpThreadDuplicationCost(const TargetTransformInfo *TTI,
>> +                                             BasicBlock *BB,
>>                                               Instruction *StopAt,
>>                                               unsigned Threshold) {
>>    assert(StopAt->getParent() == BB && "Not an instruction from proper BB?");
>> @@ -550,26 +553,21 @@ static unsigned getJumpThreadDuplicationCost(BasicBlock *BB,
>>      if (Size > Threshold)
>>        return Size;
>>
>> -    // Debugger intrinsics don't incur code size.
>> -    if (isa<DbgInfoIntrinsic>(I)) continue;
>> -
>> -    // Pseudo-probes don't incur code size.
>> -    if (isa<PseudoProbeInst>(I))
>> -      continue;
>> -
>> -    // If this is a pointer->pointer bitcast, it is free.
>> -    if (isa<BitCastInst>(I) && I->getType()->isPointerTy())
>> -      continue;
>> -
>> -    // Freeze instruction is free, too.
>> -    if (isa<FreezeInst>(I))
>> -      continue;
>> -
>>      // Bail out if this instruction gives back a token type, it is not possible
>>      // to duplicate it if it is used outside this BB.
>>      if (I->getType()->isTokenTy() && I->isUsedOutsideOfBlock(BB))
>>        return ~0U;
>>
>> +    // Blocks with NoDuplicate are modelled as having infinite cost, so they
>> +    // are never duplicated.
>> +    if (const CallInst *CI = dyn_cast<CallInst>(I))
>> +      if (CI->cannotDuplicate() || CI->isConvergent())
>> +        return ~0U;
>> +
>> +    if (TTI->getUserCost(&*I, TargetTransformInfo::TCK_SizeAndLatency)
>> +            == TargetTransformInfo::TCC_Free)
>> +      continue;
>> +
>>      // All other instructions count for at least one unit.
>>      ++Size;
>>
>> @@ -578,11 +576,7 @@ static unsigned getJumpThreadDuplicationCost(BasicBlock *BB,
>>      // as having cost of 2 total, and if they are a vector intrinsic, we model
>>      // them as having cost 1.
>>      if (const CallInst *CI = dyn_cast<CallInst>(I)) {
>> -      if (CI->cannotDuplicate() || CI->isConvergent())
>> -        // Blocks with NoDuplicate are modelled as having infinite cost, so they
>> -        // are never duplicated.
>> -        return ~0U;
>> -      else if (!isa<IntrinsicInst>(CI))
>> +      if (!isa<IntrinsicInst>(CI))
>>          Size += 3;
>>        else if (!CI->getType()->isVectorTy())
>>          Size += 1;
>> @@ -2234,10 +2228,10 @@ bool JumpThreadingPass::maybethreadThroughTwoBasicBlocks(BasicBlock *BB,
>>    }
>>
>>    // Compute the cost of duplicating BB and PredBB.
>> -  unsigned BBCost =
>> -      getJumpThreadDuplicationCost(BB, BB->getTerminator(), BBDupThreshold);
>> +  unsigned BBCost = getJumpThreadDuplicationCost(
>> +      TTI, BB, BB->getTerminator(), BBDupThreshold);
>>    unsigned PredBBCost = getJumpThreadDuplicationCost(
>> -      PredBB, PredBB->getTerminator(), BBDupThreshold);
>> +      TTI, PredBB, PredBB->getTerminator(), BBDupThreshold);
>>
>>    // Give up if costs are too high.  We need to check BBCost and PredBBCost
>>    // individually before checking their sum because getJumpThreadDuplicationCost
>> @@ -2345,8 +2339,8 @@ bool JumpThreadingPass::tryThreadEdge(
>>      return false;
>>    }
>>
>> -  unsigned JumpThreadCost =
>> -      getJumpThreadDuplicationCost(BB, BB->getTerminator(), BBDupThreshold);
>> +  unsigned JumpThreadCost = getJumpThreadDuplicationCost(
>> +      TTI, BB, BB->getTerminator(), BBDupThreshold);
>>    if (JumpThreadCost > BBDupThreshold) {
>>      LLVM_DEBUG(dbgs() << "  Not threading BB '" << BB->getName()
>>                        << "' - Cost is too high: " << JumpThreadCost << "\n");
>> @@ -2614,8 +2608,8 @@ bool JumpThreadingPass::duplicateCondBranchOnPHIIntoPred(
>>      return false;
>>    }
>>
>> -  unsigned DuplicationCost =
>> -      getJumpThreadDuplicationCost(BB, BB->getTerminator(), BBDupThreshold);
>> +  unsigned DuplicationCost = getJumpThreadDuplicationCost(
>> +      TTI, BB, BB->getTerminator(), BBDupThreshold);
>>    if (DuplicationCost > BBDupThreshold) {
>>      LLVM_DEBUG(dbgs() << "  Not duplicating BB '" << BB->getName()
>>                        << "' - Cost is too high: " << DuplicationCost << "\n");
>> @@ -3031,7 +3025,8 @@ bool JumpThreadingPass::threadGuard(BasicBlock *BB, IntrinsicInst *Guard,
>>
>>    ValueToValueMapTy UnguardedMapping, GuardedMapping;
>>    Instruction *AfterGuard = Guard->getNextNode();
>> -  unsigned Cost = getJumpThreadDuplicationCost(BB, AfterGuard, BBDupThreshold);
>> +  unsigned Cost =
>> +      getJumpThreadDuplicationCost(TTI, BB, AfterGuard, BBDupThreshold);
>>    if (Cost > BBDupThreshold)
>>      return false;
>>    // Duplicate all instructions before the guard and the guard itself to the
>> diff --git a/llvm/test/Transforms/JumpThreading/free_instructions.ll b/llvm/test/Transforms/JumpThreading/free_instructions.ll
>> index f768ec996779..76392af77d33 100644
>> --- a/llvm/test/Transforms/JumpThreading/free_instructions.ll
>> +++ b/llvm/test/Transforms/JumpThreading/free_instructions.ll
>> @@ -5,26 +5,28 @@
>>  ; the jump threading threshold, as everything else are free instructions.
>>
>>  define i32 @free_instructions(i1 %c, i32* %p) {
>>  ; CHECK-LABEL: @free_instructions(
>> -; CHECK-NEXT:    br i1 [[C:%.*]], label [[IF:%.*]], label [[ELSE:%.*]]
>> -; CHECK:       if:
>> +; CHECK-NEXT:    br i1 [[C:%.*]], label [[IF2:%.*]], label [[ELSE2:%.*]]
>> +; CHECK:       if2:
>>  ; CHECK-NEXT:    store i32 -1, i32* [[P:%.*]], align 4
>> -; CHECK-NEXT:    br label [[JOIN:%.*]]
>> -; CHECK:       else:
>> -; CHECK-NEXT:    store i32 -2, i32* [[P]], align 4
>> -; CHECK-NEXT:    br label [[JOIN]]
>> -; CHECK:       join:
>>  ; CHECK-NEXT:    call void @llvm.experimental.noalias.scope.decl(metadata [[META0:![0-9]+]])
>>  ; CHECK-NEXT:    store i32 1, i32* [[P]], align 4, !noalias !0
>>  ; CHECK-NEXT:    call void @llvm.assume(i1 true) [ "align"(i32* [[P]], i64 32) ]
>>  ; CHECK-NEXT:    store i32 2, i32* [[P]], align 4
>> +; CHECK-NEXT:    [[P21:%.*]] = bitcast i32* [[P]] to i8*
>> +; CHECK-NEXT:    [[P32:%.*]] = call i8* @llvm.launder.invariant.group.p0i8(i8* [[P21]])
>> +; CHECK-NEXT:    [[P43:%.*]] = bitcast i8* [[P32]] to i32*
>> +; CHECK-NEXT:    store i32 3, i32* [[P43]], align 4, !invariant.group !3
>> +; CHECK-NEXT:    ret i32 0
>> +; CHECK:       else2:
>> +; CHECK-NEXT:    store i32 -2, i32* [[P]], align 4
>> +; CHECK-NEXT:    call void @llvm.experimental.noalias.scope.decl(metadata [[META4:![0-9]+]])
>> +; CHECK-NEXT:    store i32 1, i32* [[P]], align 4, !noalias !4
>> +; CHECK-NEXT:    call void @llvm.assume(i1 true) [ "align"(i32* [[P]], i64 32) ]
>> +; CHECK-NEXT:    store i32 2, i32* [[P]], align 4
>>  ; CHECK-NEXT:    [[P2:%.*]] = bitcast i32* [[P]] to i8*
>>  ; CHECK-NEXT:    [[P3:%.*]] = call i8* @llvm.launder.invariant.group.p0i8(i8* [[P2]])
>>  ; CHECK-NEXT:    [[P4:%.*]] = bitcast i8* [[P3]] to i32*
>>  ; CHECK-NEXT:    store i32 3, i32* [[P4]], align 4, !invariant.group !3
>> -; CHECK-NEXT:    br i1 [[C]], label [[IF2:%.*]], label [[ELSE2:%.*]]
>> -; CHECK:       if2:
>> -; CHECK-NEXT:    ret i32 0
>> -; CHECK:       else2:
>>  ; CHECK-NEXT:    ret i32 1
>>  ;
>>    br i1 %c, label %if, label %else
>> diff --git a/llvm/test/Transforms/PhaseOrdering/inlining-alignment-assumptions.ll b/llvm/test/Transforms/PhaseOrdering/inlining-alignment-assumptions.ll
>> index 57014e856a09..f764a59dd8a2 100644
>> --- a/llvm/test/Transforms/PhaseOrdering/inlining-alignment-assumptions.ll
>> +++ b/llvm/test/Transforms/PhaseOrdering/inlining-alignment-assumptions.ll
>> @@ -32,13 +32,10 @@ define void @caller1(i1 %c, i64* align 1 %ptr) {
>>  ; ASSUMPTIONS-OFF-NEXT:    br label [[COMMON_RET]]
>>  ;
>>  ; ASSUMPTIONS-ON-LABEL: @caller1(
>> -; ASSUMPTIONS-ON-NEXT:    br i1 [[C:%.*]], label [[COMMON_RET:%.*]], label [[FALSE1:%.*]]
>> -; ASSUMPTIONS-ON:       false1:
>> -; ASSUMPTIONS-ON-NEXT:    store volatile i64 1, i64* [[PTR:%.*]], align 4
>> -; ASSUMPTIONS-ON-NEXT:    br label [[COMMON_RET]]
>> +; ASSUMPTIONS-ON-NEXT:    br i1 [[C:%.*]], label [[COMMON_RET:%.*]], label [[FALSE2:%.*]]
>>  ; ASSUMPTIONS-ON:       common.ret:
>> -; ASSUMPTIONS-ON-NEXT:    [[DOTSINK:%.*]] = phi i64 [ 3, [[FALSE1]] ], [ 2, [[TMP0:%.*]] ]
>> -; ASSUMPTIONS-ON-NEXT:    call void @llvm.assume(i1 true) [ "align"(i64* [[PTR]], i64 8) ]
>> +; ASSUMPTIONS-ON-NEXT:    [[DOTSINK:%.*]] = phi i64 [ 3, [[FALSE2]] ], [ 2, [[TMP0:%.*]] ]
>> +; ASSUMPTIONS-ON-NEXT:    call void @llvm.assume(i1 true) [ "align"(i64* [[PTR:%.*]], i64 8) ]
>>  ; ASSUMPTIONS-ON-NEXT:    store volatile i64 0, i64* [[PTR]], align 8
>>  ; ASSUMPTIONS-ON-NEXT:    store volatile i64 -1, i64* [[PTR]], align 8
>>  ; ASSUMPTIONS-ON-NEXT:    store volatile i64 -1, i64* [[PTR]], align 8
>> @@ -47,6 +44,9 @@ define void @caller1(i1 %c, i64* align 1 %ptr) {
>>  ; ASSUMPTIONS-ON-NEXT:    store volatile i64 -1, i64* [[PTR]], align 8
>>  ; ASSUMPTIONS-ON-NEXT:    store volatile i64 [[DOTSINK]], i64* [[PTR]], align 8
>>  ; ASSUMPTIONS-ON-NEXT:    ret void
>> +; ASSUMPTIONS-ON:       false2:
>> +; ASSUMPTIONS-ON-NEXT:    store volatile i64 1, i64* [[PTR]], align 4
>> +; ASSUMPTIONS-ON-NEXT:    br label [[COMMON_RET]]
>>  ;
>>    br i1 %c, label %true1, label %false1
>>
>> </cut>
>
_______________________________________________
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/linaro-toolchain