Successfully identified regression in *llvm* in CI configuration 
tcwg_bmk_llvm_tk1/llvm-master-arm-spec2k6-O2.  So far, this commit has 
regressed CI configurations:
 - tcwg_bmk_llvm_tk1/llvm-master-arm-spec2k6-O2

Culprit:
<cut>
commit d181fd918d18cbd99768f025e14a69d35d275f14
Author: Simon Pilgrim <llvm-...@redking.me.uk>
Date:   Fri Jul 2 14:27:27 2021 +0100

    [CostModel][X86] Drop some hard coded fp<->int scalarization costs
    
    Scalarization costs handling is a lot better now, and the hard coded costs were higher than the worse case numbers from the script in D103695
</cut>

Results regressed to (for first_bad == d181fd918d18cbd99768f025e14a69d35d275f14)
# reset_artifacts:
-10
# build_abe binutils:
-9
# build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer:
-8
# build_abe linux:
-7
# build_abe glibc:
-6
# build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer:
-5
# build_llvm true:
-3
# true:
0
# benchmark -O2_marm -- artifacts/build-d181fd918d18cbd99768f025e14a69d35d275f14/results_id:
1
# 400.perlbench,libc-2.33.9000.so                               regressed by 113

from (for last_good == 5df556ac8bb8c5f4ef3dff1a2039dd389d1d27c0)
# reset_artifacts:
-10
# build_abe binutils:
-9
# build_abe stage1 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer:
-8
# build_abe linux:
-7
# build_abe glibc:
-6
# build_abe stage2 -- --set gcc_override_configure=--with-mode=arm --set gcc_override_configure=--disable-libsanitizer:
-5
# build_llvm true:
-3
# true:
0
# benchmark -O2_marm -- artifacts/build-5df556ac8bb8c5f4ef3dff1a2039dd389d1d27c0/results_id:
1

Artifacts of last_good build: 
https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2/10/artifact/artifacts/build-5df556ac8bb8c5f4ef3dff1a2039dd389d1d27c0/
Results ID of last_good: 
tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-master-arm-spec2k6-O2/1840
Artifacts of first_bad build: 
https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2/10/artifact/artifacts/build-d181fd918d18cbd99768f025e14a69d35d275f14/
Results ID of first_bad: 
tk1_32/tcwg_bmk_llvm_tk1/bisect-llvm-master-arm-spec2k6-O2/1837
Build top page/logs: 
https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2/10/

Configuration details:


Reproduce builds:
<cut>
mkdir investigate-llvm-d181fd918d18cbd99768f025e14a69d35d275f14
cd investigate-llvm-d181fd918d18cbd99768f025e14a69d35d275f14

git clone https://git.linaro.org/toolchain/jenkins-scripts

mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2/10/artifact/artifacts/manifests/build-baseline.sh --fail
curl -o artifacts/manifests/build-parameters.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2/10/artifact/artifacts/manifests/build-parameters.sh --fail
curl -o artifacts/test.sh https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2/10/artifact/artifacts/test.sh --fail
chmod +x artifacts/test.sh

# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh

# Save baseline build state (which is then restored in artifacts/test.sh)
rsync -a --del --delete-excluded --exclude bisect/ --exclude artifacts/ --exclude llvm/ ./ ./bisect/baseline/

cd llvm

# Reproduce first_bad build
git checkout --detach d181fd918d18cbd99768f025e14a69d35d275f14
../artifacts/test.sh

# Reproduce last_good build
git checkout --detach 5df556ac8bb8c5f4ef3dff1a2039dd389d1d27c0
../artifacts/test.sh

cd ..
</cut>
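
Before spending time on the two rebuilds above, it may be worth confirming that both commits exist in the local llvm checkout and that last_good really precedes first_bad. The following is only a sketch: check_bisect_range is a hypothetical helper, not part of jenkins-scripts, and it assumes that last_good is expected to be an ancestor of first_bad, which this report does not state explicitly.

```shell
# Hypothetical helper (not part of jenkins-scripts).
# Usage: check_bisect_range <repo-dir> <last_good> <first_bad>
# Returns 0 if <last_good> is an ancestor of <first_bad>,
#         1 if it is not,
#         2 if either commit is missing from the repository.
check_bisect_range() {
    # Both hashes must name commits that are present locally.
    git -C "$1" cat-file -e "$2^{commit}" 2>/dev/null || return 2
    git -C "$1" cat-file -e "$3^{commit}" 2>/dev/null || return 2
    # merge-base --is-ancestor exits 0 when $2 is an ancestor of $3.
    git -C "$1" merge-base --is-ancestor "$2" "$3"
}
```

From the investigate-llvm-... directory, once the baseline build has created llvm/, this could be called as:

    check_bisect_range llvm 5df556ac8bb8c5f4ef3dff1a2039dd389d1d27c0 d181fd918d18cbd99768f025e14a69d35d275f14 && echo "range OK"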

History of pending regressions and results: 
https://git.linaro.org/toolchain/ci/base-artifacts.git/log/?h=linaro-local/ci/tcwg_bmk_llvm_tk1/llvm-master-arm-spec2k6-O2

Artifacts: 
https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2/10/artifact/artifacts/
Build log: 
https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2/10/consoleText

Full commit (up to 1000 lines):
<cut>
commit d181fd918d18cbd99768f025e14a69d35d275f14
Author: Simon Pilgrim <llvm-...@redking.me.uk>
Date:   Fri Jul 2 14:27:27 2021 +0100

    [CostModel][X86] Drop some hard coded fp<->int scalarization costs
    
    Scalarization costs handling is a lot better now, and the hard coded costs were higher than the worse case numbers from the script in D103695
---
 llvm/lib/Target/X86/X86TargetTransformInfo.cpp | 13 -------------
 llvm/test/Analysis/CostModel/X86/sitofp.ll     |  6 +++---
 2 files changed, 3 insertions(+), 16 deletions(-)

diff --git a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
index d55cd8a8c7a8..9eb5abe4dd9b 100644
--- a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
+++ b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
@@ -1977,13 +1977,6 @@ InstructionCost X86TTIImpl::getCastInstrCost(unsigned Opcode, Type *Dst,
     { ISD::UINT_TO_FP,  MVT::v8f64, MVT::v8i32, 10 },
     { ISD::UINT_TO_FP,  MVT::v2f64, MVT::v2i64, 5 },
     { ISD::UINT_TO_FP,  MVT::v4f64, MVT::v4i64, 6 },
-    // The generic code to compute the scalar overhead is currently broken.
-    // Workaround this limitation by estimating the scalarization overhead
-    // here. We have roughly 10 instructions per scalar element.
-    // Multiply that by the vector width.
-    // FIXME: remove that when PR19268 is fixed.
-    { ISD::SINT_TO_FP,  MVT::v4f64, MVT::v4i64, 13 },
-    { ISD::SINT_TO_FP,  MVT::v4f64, MVT::v4i64, 13 },
 
     { ISD::FP_TO_SINT,  MVT::v8i8,  MVT::v8f32, 4 },
     { ISD::FP_TO_SINT,  MVT::v4i8,  MVT::v4f64, 3 },
@@ -2003,12 +1996,6 @@ InstructionCost X86TTIImpl::getCastInstrCost(unsigned Opcode, Type *Dst,
     { ISD::FP_TO_UINT,  MVT::v8i16, MVT::v8f32, 3 },
     { ISD::FP_TO_UINT,  MVT::v8i32, MVT::v8f32, 9 },
     { ISD::FP_TO_UINT,  MVT::v8i32, MVT::v8f64, 19 },
-    // This node is expanded into scalarized operations but BasicTTI is overly
-    // optimistic estimating its cost.  It computes 3 per element (one
-    // vector-extract, one scalar conversion and one vector-insert).  The
-    // problem is that the inserts form a read-modify-write chain so latency
-    // should be factored in too.  Inflating the cost per element by 1.
-    { ISD::FP_TO_UINT,  MVT::v4i32, MVT::v4f64, 4*4 },
 
     { ISD::FP_EXTEND,   MVT::v4f64,  MVT::v4f32,  1 },
     { ISD::FP_ROUND,    MVT::v4f32,  MVT::v4f64,  1 },
diff --git a/llvm/test/Analysis/CostModel/X86/sitofp.ll b/llvm/test/Analysis/CostModel/X86/sitofp.ll
index b3c400c93b9f..b327454c1d09 100644
--- a/llvm/test/Analysis/CostModel/X86/sitofp.ll
+++ b/llvm/test/Analysis/CostModel/X86/sitofp.ll
@@ -122,14 +122,14 @@ define i32 @sitofp_i64_double() {
 ; AVX-LABEL: 'sitofp_i64_double'
 ; AVX-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %cvt_i64_f64 = sitofp i64 undef to double
 ; AVX-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %cvt_v2i64_v2f64 = sitofp <2 x i64> undef to <2 x double>
-; AVX-NEXT:  Cost Model: Found an estimated cost of 13 for instruction: %cvt_v4i64_v4f64 = sitofp <4 x i64> undef to <4 x double>
-; AVX-NEXT:  Cost Model: Found an estimated cost of 26 for instruction: %cvt_v8i64_v8f64 = sitofp <8 x i64> undef to <8 x double>
+; AVX-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %cvt_v4i64_v4f64 = sitofp <4 x i64> undef to <4 x double>
+; AVX-NEXT:  Cost Model: Found an estimated cost of 22 for instruction: %cvt_v8i64_v8f64 = sitofp <8 x i64> undef to <8 x double>
 ; AVX-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
 ; AVX512F-LABEL: 'sitofp_i64_double'
 ; AVX512F-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %cvt_i64_f64 = sitofp i64 undef to double
 ; AVX512F-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %cvt_v2i64_v2f64 = sitofp <2 x i64> undef to <2 x double>
-; AVX512F-NEXT:  Cost Model: Found an estimated cost of 13 for instruction: %cvt_v4i64_v4f64 = sitofp <4 x i64> undef to <4 x double>
+; AVX512F-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %cvt_v4i64_v4f64 = sitofp <4 x i64> undef to <4 x double>
 ; AVX512F-NEXT:  Cost Model: Found an estimated cost of 25 for instruction: %cvt_v8i64_v8f64 = sitofp <8 x i64> undef to <8 x double>
 ; AVX512F-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret i32 undef
 ;
</cut>