Hi David, Linaro’s benchmarking CI is churning through a backlog of old commits, and it appears that your patch increases code-size of a single function by 13% — from 5890 bytes to 6642 bytes — that’s on Thumb2 at -Os. Not a big deal, but may be it is exposing a case worth investigating.
The function is BZ2_compressBlock() from SPEC CPU2006’s 401.bzip2. Let us know if you are interested in investigating it, and we’ll provide details to reproduce. Regards, -- Maxim Kuvyrkov https://www.linaro.org > On 19 Jul 2021, at 00:39, ci_not...@linaro.org wrote: > > Successfully identified regression in *llvm* in CI configuration > tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-Os. So far, this commit has > regressed CI configurations: > - tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-Os > > Culprit: > <cut> > commit ab97c9bdb747c873cd35a18229e2694156a7607d > Author: David Green <david.gr...@arm.com> > Date: Sat Dec 12 14:21:40 2020 +0000 > > [LV] Fix scalar cost for tail predicated loops > > When it comes to the scalar cost of any predicated block, the loop > vectorizer by default regards this predication as a sign that it is > looking at an if-conversion and divides the scalar cost of the block by > 2, assuming it would only be executed half the time. This however makes > no sense if the predication has been introduced to tail predicate the > loop. > > Original patch by Anna Welker > > Differential Revision: https://reviews.llvm.org/D86452 > </cut> > > Results regressed to (for first_bad == > ab97c9bdb747c873cd35a18229e2694156a7607d) > # reset_artifacts: > -10 > # build_abe binutils: > -9 > # build_abe stage1 -- --set gcc_override_configure=--with-mode=thumb --set > gcc_override_configure=--disable-libsanitizer: > -8 > # build_abe linux: > -7 > # build_abe glibc: > -6 > # build_abe stage2 -- --set gcc_override_configure=--with-mode=thumb --set > gcc_override_configure=--disable-libsanitizer: > -5 > # build_llvm true: > -3 > # true: > 0 > # benchmark -Os_mthumb -- > artifacts/build-ab97c9bdb747c873cd35a18229e2694156a7607d/results_id: > 1 > # 401.bzip2,bzip2_base.default regressed by > 103 > # 401.bzip2,[.] BZ2_compressBlock regressed by > 113 > # 473.astar,astar_base.default regressed by > 103 > > from (for last_good == d716eab197abec0b9aab4a76cd1a52b248b8c3b1) > # reset_artifacts: > -10 > # build_abe binutils: > -9 > # build_abe stage1 -- --set gcc_override_configure=--with-mode=thumb --set > gcc_override_configure=--disable-libsanitizer: > -8 > # build_abe linux: > -7 > # build_abe glibc: > -6 > # build_abe stage2 -- --set gcc_override_configure=--with-mode=thumb --set > gcc_override_configure=--disable-libsanitizer: > -5 > # build_llvm true: > -3 > # true: > 0 > # benchmark -Os_mthumb -- > artifacts/build-d716eab197abec0b9aab4a76cd1a52b248b8c3b1/results_id: > 1 > > Artifacts of last_good build: > https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release-arm-spec2k6-Os/19/artifact/artifacts/build-d716eab197abec0b9aab4a76cd1a52b248b8c3b1/ > Results ID of last_good: > tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-release-arm-spec2k6-Os/1878 > Artifacts of first_bad build: > https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release-arm-spec2k6-Os/19/artifact/artifacts/build-ab97c9bdb747c873cd35a18229e2694156a7607d/ > Results ID of first_bad: > tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-release-arm-spec2k6-Os/1876 > Build top page/logs: > https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release-arm-spec2k6-Os/19/ > > Configuration details: > > > Reproduce builds: > <cut> > mkdir investigate-llvm-ab97c9bdb747c873cd35a18229e2694156a7607d > cd investigate-llvm-ab97c9bdb747c873cd35a18229e2694156a7607d > > git clone https://git.linaro.org/toolchain/jenkins-scripts > > mkdir -p artifacts/manifests > curl -o artifacts/manifests/build-baseline.sh > https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release-arm-spec2k6-Os/19/artifact/artifacts/manifests/build-baseline.sh > --fail > curl -o artifacts/manifests/build-parameters.sh > https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release-arm-spec2k6-Os/19/artifact/artifacts/manifests/build-parameters.sh > --fail > curl -o artifacts/test.sh > https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release-arm-spec2k6-Os/19/artifact/artifacts/test.sh > --fail > chmod +x artifacts/test.sh > > # Reproduce the baseline build (build all pre-requisites) > ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh > > # Save baseline build state (which is then restored in artifacts/test.sh) > rsync -a --del --delete-excluded --exclude bisect/ --exclude artifacts/ > --exclude llvm/ ./ ./bisect/baseline/ > > cd llvm > > # Reproduce first_bad build > git checkout --detach ab97c9bdb747c873cd35a18229e2694156a7607d > ../artifacts/test.sh > > # Reproduce last_good build > git checkout --detach d716eab197abec0b9aab4a76cd1a52b248b8c3b1 > ../artifacts/test.sh > > cd .. > </cut> > > History of pending regressions and results: > https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/ci/tcwg_bmk_llvm_tk1/llvm-release-arm-spec2k6-Os > > Artifacts: > https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release-arm-spec2k6-Os/19/artifact/artifacts/ > Build log: > https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-release-arm-spec2k6-Os/19/consoleText > > Full commit (up to 1000 lines): > <cut> > commit ab97c9bdb747c873cd35a18229e2694156a7607d > Author: David Green <david.gr...@arm.com> > Date: Sat Dec 12 14:21:40 2020 +0000 > > [LV] Fix scalar cost for tail predicated loops > > When it comes to the scalar cost of any predicated block, the loop > vectorizer by default regards this predication as a sign that it is > looking at an if-conversion and divides the scalar cost of the block by > 2, assuming it would only be executed half the time. This however makes > no sense if the predication has been introduced to tail predicate the > loop. > > Original patch by Anna Welker > > Differential Revision: https://reviews.llvm.org/D86452 > --- > llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | 7 ++++--- > llvm/test/Transforms/LoopVectorize/ARM/scalar-block-cost.ll | 2 +- > 2 files changed, 5 insertions(+), 4 deletions(-) > > diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp > b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp > index c381377b67c9..663ea50c4c02 100644 > --- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp > +++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp > @@ -6483,9 +6483,10 @@ LoopVectorizationCostModel::expectedCost(ElementCount > VF) { > // if-converted. This means that the block's instructions (aside from > // stores and instructions that may divide by zero) will now be > // unconditionally executed. For the scalar case, we may not always > execute > - // the predicated block. Thus, scale the block's cost by the probability > of > - // executing it. > - if (VF.isScalar() && blockNeedsPredication(BB)) > + // the predicated block, if it is an if-else block. Thus, scale the > block's > + // cost by the probability of executing it. blockNeedsPredication from > + // Legal is used so as to not include all blocks in tail folded loops. > + if (VF.isScalar() && Legal->blockNeedsPredication(BB)) > BlockCost.first /= getReciprocalPredBlockProb(); > > Cost.first += BlockCost.first; > diff --git a/llvm/test/Transforms/LoopVectorize/ARM/scalar-block-cost.ll > b/llvm/test/Transforms/LoopVectorize/ARM/scalar-block-cost.ll > index 959fbe676e67..fc8ea4fc938c 100644 > --- a/llvm/test/Transforms/LoopVectorize/ARM/scalar-block-cost.ll > +++ b/llvm/test/Transforms/LoopVectorize/ARM/scalar-block-cost.ll > @@ -15,7 +15,7 @@ define void @pred_loop(i32* %off, i32* %data, i32* %dst, > i32 %n) #0 { > ; CHECK-COST-NEXT: LV: Found an estimated cost of 1 for VF 1 For instruction: > store i32 %add1, i32* %arrayidx2, align 4 > ; CHECK-COST-NEXT: LV: Found an estimated cost of 1 for VF 1 For instruction: > %exitcond.not = icmp eq i32 %add, %n > ; CHECK-COST-NEXT: LV: Found an estimated cost of 0 for VF 1 For instruction: > br i1 %exitcond.not, label %exit.loopexit, label %for.body > -; CHECK-COST-NEXT: LV: Scalar loop costs: 2. > +; CHECK-COST-NEXT: LV: Scalar loop costs: 5. > > entry: > %cmp8 = icmp sgt i32 %n, 0 > </cut> _______________________________________________ linaro-toolchain mailing list linaro-toolchain@lists.linaro.org https://lists.linaro.org/mailman/listinfo/linaro-toolchain