[llvm-branch-commits] [llvm] 202d359 - [X86] Add the FSRM feature (Fast Short Rep Mov) to Zen3.

2021-01-14 Thread Hiroshi Yamauchi via llvm-branch-commits

Author: Hiroshi Yamauchi
Date: 2021-01-14T10:47:33-08:00
New Revision: 202d359753d1f130a228c3ad52dfaabf384250d1

URL: https://github.com/llvm/llvm-project/commit/202d359753d1f130a228c3ad52dfaabf384250d1
DIFF: https://github.com/llvm/llvm-project/commit/202d359753d1f130a228c3ad52dfaabf384250d1.diff

LOG: [X86] Add the FSRM feature (Fast Short Rep Mov) to Zen3.

Note that -x86-use-fsrm-for-memcpy is still disabled by default, so there is no
default behavior change.

Differential Revision: https://reviews.llvm.org/D94436
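
To illustrate why there is no default behavior change: judging from the RUN lines in the test below, the FSRM lowering only appears when the target CPU has the feature and -x86-use-fsrm-for-memcpy is passed explicitly. A minimal sketch of that gating follows. The names `Subtarget` and `useFsrmForMemcpy` are hypothetical stand-ins, not LLVM's actual API, and the flag is modelled as a plain bool.

```cpp
#include <cassert>

// Hypothetical stand-in for a subtarget; not LLVM's X86Subtarget.
struct Subtarget {
  bool HasFSRM; // true for CPUs whose feature list includes FSRM
};

// Assumed gating: the FSRM lowering needs BOTH the CPU feature and the
// opt-in flag, which is off unless the caller turns it on.
inline bool useFsrmForMemcpy(const Subtarget &ST, bool FlagEnabled = false) {
  return FlagEnabled && ST.HasFSRM;
}
```

With this model, adding the feature to a CPU changes nothing until the flag is also set, which matches the commit note.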

Added: 


Modified: 
llvm/lib/Target/X86/X86.td
llvm/test/CodeGen/X86/memcpy-inline-fsrm.ll

Removed: 




diff  --git a/llvm/lib/Target/X86/X86.td b/llvm/lib/Target/X86/X86.td
index 9096d9d54452..c492d686c52e 100644
--- a/llvm/lib/Target/X86/X86.td
+++ b/llvm/lib/Target/X86/X86.td
@@ -1071,7 +1071,8 @@ def ProcessorFeatures {
   list<SubtargetFeature> ZN2Tuning = ZNTuning;
   list<SubtargetFeature> ZN2Features =
     !listconcat(ZNFeatures, ZN2AdditionalFeatures);
-  list<SubtargetFeature> ZN3AdditionalFeatures = [FeatureINVPCID,
+  list<SubtargetFeature> ZN3AdditionalFeatures = [FeatureFSRM,
+                                                  FeatureINVPCID,
                                                   FeaturePKU,
                                                   FeatureVAES,
                                                   FeatureVPCLMULQDQ];

diff  --git a/llvm/test/CodeGen/X86/memcpy-inline-fsrm.ll b/llvm/test/CodeGen/X86/memcpy-inline-fsrm.ll
index 9480d74723fc..77e97626b1c6 100644
--- a/llvm/test/CodeGen/X86/memcpy-inline-fsrm.ll
+++ b/llvm/test/CodeGen/X86/memcpy-inline-fsrm.ll
@@ -4,6 +4,7 @@
 ; RUN: llc -mtriple=x86_64-linux-gnu -x86-use-fsrm-for-memcpy -mcpu=haswell < %s | FileCheck %s --check-prefix=NOFSRM
 ; RUN: llc -mtriple=x86_64-linux-gnu -x86-use-fsrm-for-memcpy -mcpu=icelake-client < %s | FileCheck %s --check-prefix=FSRM
 ; RUN: llc -mtriple=x86_64-linux-gnu -x86-use-fsrm-for-memcpy -mcpu=icelake-server < %s | FileCheck %s --check-prefix=FSRM
+; RUN: llc -mtriple=x86_64-linux-gnu -x86-use-fsrm-for-memcpy -mcpu=znver3 < %s | FileCheck %s --check-prefix=FSRM
 
 declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture, i8* nocapture, i64, i1) nounwind
 



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] cf5415c - [PGO][PGSO] Let unroll hints take precedence over PGSO.

2021-01-07 Thread Hiroshi Yamauchi via llvm-branch-commits

Author: Hiroshi Yamauchi
Date: 2021-01-07T10:10:31-08:00
New Revision: cf5415c727dda0ea4b27ee16347d170f118b037b

URL: https://github.com/llvm/llvm-project/commit/cf5415c727dda0ea4b27ee16347d170f118b037b
DIFF: https://github.com/llvm/llvm-project/commit/cf5415c727dda0ea4b27ee16347d170f118b037b.diff

LOG: [PGO][PGSO] Let unroll hints take precedence over PGSO.

Differential Revision: https://reviews.llvm.org/D94199
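
The precedence rule can be sketched in isolation. This is a simplified model of the condition changed in the diff below, not the pass itself: `UnrollHint` and `optForSize` are hypothetical names, and the profile query is reduced to a bool.

```cpp
#include <cassert>

// Hypothetical mirror of whether the user forced an unroll transformation.
enum class UnrollHint { None, ForcedByUser };

// Sketch of the size decision: the function-level optsize attribute always
// applies, but a profile-guided "cold" verdict is ignored when the user
// explicitly requested unrolling via a hint or pragma.
inline bool optForSize(bool FunctionHasOptSize, UnrollHint Hint,
                       bool ProfileSaysCold) {
  return FunctionHasOptSize ||
         (Hint != UnrollHint::ForcedByUser && ProfileSaysCold);
}
```

In this model a cold loop with a user-forced unroll hint is no longer treated as size-optimized, so the smaller size thresholds are not applied to it.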

Added: 


Modified: 
llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
llvm/test/Transforms/LoopUnroll/unroll-opt-attribute.ll

Removed: 




diff  --git a/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp b/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
index 4cce05d595a8..d09f1ee22a75 100644
--- a/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopUnrollPass.cpp
@@ -218,8 +218,10 @@ TargetTransformInfo::UnrollingPreferences llvm::gatherUnrollingPreferences(
 
   // Apply size attributes
   bool OptForSize = L->getHeader()->getParent()->hasOptSize() ||
-                    llvm::shouldOptimizeForSize(L->getHeader(), PSI, BFI,
-                                                PGSOQueryType::IRPass);
+    // Let unroll hints / pragmas take precedence over PGSO.
+    (hasUnrollTransformation(L) != TM_ForcedByUser &&
+     llvm::shouldOptimizeForSize(L->getHeader(), PSI, BFI,
+                                 PGSOQueryType::IRPass));
   if (OptForSize) {
     UP.Threshold = UP.OptSizeThreshold;
     UP.PartialThreshold = UP.PartialOptSizeThreshold;

diff  --git a/llvm/test/Transforms/LoopUnroll/unroll-opt-attribute.ll b/llvm/test/Transforms/LoopUnroll/unroll-opt-attribute.ll
index e219349fb093..884981bf8bf0 100644
--- a/llvm/test/Transforms/LoopUnroll/unroll-opt-attribute.ll
+++ b/llvm/test/Transforms/LoopUnroll/unroll-opt-attribute.ll
@@ -158,6 +158,38 @@ for.end:  ; preds = %for.body
 ; NPGSO-NOT:  phi
 ; NPGSO-NOT:  icmp
 
+;/ TEST 6 //
+
+; This test tests that unroll hints take precedence over PGSO and that this loop
+; gets unrolled even though it's cold.
+
+define i32 @Test6() !prof !14 {
+entry:
+  br label %for.body
+
+for.body: ; preds = %for.body, %entry
+  %i.05 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
+  %arrayidx = getelementptr inbounds [24 x i32], [24 x i32]* @tab, i32 0, i32 %i.05
+  store i32 %i.05, i32* %arrayidx, align 4
+  %inc = add nuw nsw i32 %i.05, 1
+  %exitcond = icmp eq i32 %inc, 24
+  br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !15
+
+for.end:  ; preds = %for.body
+  ret i32 42
+}
+
+; PGSO-LABEL: @Test6
+; PGSO:  store
+; PGSO:  store
+; PGSO:  store
+; PGSO:  store
+; NPGSO-LABEL: @Test6
+; NPGSO:  store
+; NPGSO:  store
+; NPGSO:  store
+; NPGSO:  store
+
 !llvm.module.flags = !{!0}
 !0 = !{i32 1, !"ProfileSummary", !1}
 !1 = !{!2, !3, !4, !5, !6, !7, !8, !9}
@@ -174,3 +206,5 @@ for.end:  ; preds = %for.body
 !12 = !{i32 999000, i64 100, i32 1}
 !13 = !{i32 99, i64 1, i32 2}
 !14 = !{!"function_entry_count", i64 0}
+!15 = !{!15, !16}
+!16 = !{!"llvm.loop.unroll.count", i32 4}





[llvm-branch-commits] [llvm] f9c3954 - Fix for Bug 48055.

2020-12-04 Thread Hiroshi Yamauchi via llvm-branch-commits

Author: Hiroshi Yamauchi
Date: 2020-12-04T11:05:01-08:00
New Revision: f9c3954a6ec5d1066a67aadab848f02b9b78b056

URL: https://github.com/llvm/llvm-project/commit/f9c3954a6ec5d1066a67aadab848f02b9b78b056
DIFF: https://github.com/llvm/llvm-project/commit/f9c3954a6ec5d1066a67aadab848f02b9b78b056.diff

LOG: Fix for Bug 48055.

Differential Revision: https://reviews.llvm.org/D92599
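
Going by the diff below, the fix adds an opaque-struct check before the evaluator steps from a struct pointer to a pointer to its first member. An opaque struct has no known body, so there is no first member to step into. A toy model of that guard follows. `ToyType` and `firstMemberOrNull` are hypothetical and much simpler than llvm::Type.

```cpp
#include <cassert>
#include <vector>

// Toy type model (hypothetical; far simpler than llvm::Type).
struct ToyType {
  bool IsStruct = false;
  bool IsOpaque = false;                 // opaque structs have no known body
  std::vector<const ToyType *> Elements; // empty when the struct is opaque
};

// Step from a struct to its first member, or return nullptr when that is
// not possible. The opaque check runs before Elements is touched.
inline const ToyType *firstMemberOrNull(const ToyType &Ty) {
  if (!Ty.IsStruct || Ty.IsOpaque || Ty.Elements.empty())
    return nullptr;
  return Ty.Elements.front();
}
```

In the new test, %struct.g plays the role of the opaque type: the loop has to stop there instead of looking for a first member.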

Added: 
llvm/test/Transforms/GlobalOpt/evaluate-bitcast-4.ll

Modified: 
llvm/lib/Transforms/Utils/Evaluator.cpp

Removed: 




diff  --git a/llvm/lib/Transforms/Utils/Evaluator.cpp b/llvm/lib/Transforms/Utils/Evaluator.cpp
index 6fe29381c71a..a01dc7a4e609 100644
--- a/llvm/lib/Transforms/Utils/Evaluator.cpp
+++ b/llvm/lib/Transforms/Utils/Evaluator.cpp
@@ -183,11 +183,11 @@ evaluateBitcastFromPtr(Constant *Ptr, const DataLayout &DL,
                        std::function<Constant *(Constant *)> Func) {
   Constant *Val;
   while (!(Val = Func(Ptr))) {
-    // If Ty is a struct, we can convert the pointer to the struct
+    // If Ty is a non-opaque struct, we can convert the pointer to the struct
     // into a pointer to its first member.
     // FIXME: This could be extended to support arrays as well.
     Type *Ty = cast<PointerType>(Ptr->getType())->getElementType();
-    if (!isa<StructType>(Ty))
+    if (!isa<StructType>(Ty) || cast<StructType>(Ty)->isOpaque())
       break;
 
     IntegerType *IdxTy = IntegerType::get(Ty->getContext(), 32);

diff  --git a/llvm/test/Transforms/GlobalOpt/evaluate-bitcast-4.ll b/llvm/test/Transforms/GlobalOpt/evaluate-bitcast-4.ll
new file mode 100644
index 000000000000..1e7c89894d2a
--- /dev/null
+++ b/llvm/test/Transforms/GlobalOpt/evaluate-bitcast-4.ll
@@ -0,0 +1,29 @@
+; PR48055. Check that this does not crash.
+; RUN: opt -globalopt %s -disable-output
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+%struct.g = type opaque
+%struct.a = type { i32 (...)** }
+
+@l = dso_local global i32 0, align 4
+@h = external dso_local global %struct.g, align 1
+@llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 65535, void ()* @_GLOBAL__sub_I_bug48055.cc, i8* null }]
+
+; Function Attrs: uwtable
+define internal void @__cxx_global_var_init() {
+entry:
+  %vtable = load i32 (%struct.a*)**, i32 (%struct.a*)*** bitcast (%struct.g* @h to i32 (%struct.a*)***), align 1
+  %0 = load i32 (%struct.a*)*, i32 (%struct.a*)** %vtable, align 8
+  %call = call i32 %0(%struct.a* nonnull dereferenceable(8) bitcast (%struct.g* @h to %struct.a*))
+  store i32 %call, i32* @l, align 4
+  ret void
+}
+
+; Function Attrs: uwtable
+define internal void @_GLOBAL__sub_I_bug48055.cc() {
+entry:
+  call void @__cxx_global_var_init()
+  ret void
+}
+





[llvm-branch-commits] [llvm] a5f5612 - [PGO] Adjust -vp-counters-per-site under dynamic linking.

2020-12-11 Thread Hiroshi Yamauchi via llvm-branch-commits

Author: Hiroshi Yamauchi
Date: 2020-12-11T09:42:53-08:00
New Revision: a5f5612263ca650ae0aa433539278c2d5f567cf8

URL: https://github.com/llvm/llvm-project/commit/a5f5612263ca650ae0aa433539278c2d5f567cf8
DIFF: https://github.com/llvm/llvm-project/commit/a5f5612263ca650ae0aa433539278c2d5f567cf8.diff

LOG: [PGO] Adjust -vp-counters-per-site under dynamic linking.

This addresses the clang bootstrap in dynamic linking mode running out of
statically allocated value profile nodes, as reported in D81682.

Differential Revision: https://reviews.llvm.org/D92669
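
As a rough picture of what raising the factor to 1.5 buys, the sketch below assumes the static pool of value profile nodes scales linearly with the number of value sites times -vp-counters-per-site. That linear model and the name `staticValueNodes` are assumptions for illustration only. The real sizing logic lives in LLVM's instrumentation lowering and may apply further heuristics.

```cpp
#include <cassert>
#include <cstdint>

// Assumed model: the statically allocated pool is sized as
// (number of value sites) x (counters per site), truncated to an integer.
inline std::uint64_t staticValueNodes(std::uint64_t TotalSites,
                                      double CountersPerSite) {
  return static_cast<std::uint64_t>(TotalSites * CountersPerSite);
}
```

Under this model, a module with 1000 value sites gets 1500 static nodes instead of 1000 when the factor goes from 1.0 to 1.5.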

Added: 


Modified: 
llvm/cmake/modules/HandleLLVMOptions.cmake

Removed: 




diff  --git a/llvm/cmake/modules/HandleLLVMOptions.cmake b/llvm/cmake/modules/HandleLLVMOptions.cmake
index 78ed1c06ac9d..f313492ba978 100644
--- a/llvm/cmake/modules/HandleLLVMOptions.cmake
+++ b/llvm/cmake/modules/HandleLLVMOptions.cmake
@@ -917,6 +917,15 @@ if (LLVM_BUILD_INSTRUMENTED)
         CMAKE_EXE_LINKER_FLAGS
         CMAKE_SHARED_LINKER_FLAGS)
     endif()
+    # Set this to avoid running out of the value profile node section
+    # under clang in dynamic linking mode.
+    if (CMAKE_CXX_COMPILER_ID MATCHES "Clang" AND
+        CMAKE_CXX_COMPILER_VERSION VERSION_GREATER_EQUAL 11 AND
+        LLVM_LINK_LLVM_DYLIB)
+      append("-Xclang -mllvm -Xclang -vp-counters-per-site=1.5"
+        CMAKE_CXX_FLAGS
+        CMAKE_C_FLAGS)
+    endif()
   elseif(uppercase_LLVM_BUILD_INSTRUMENTED STREQUAL "CSIR")
     append("-fcs-profile-generate=\"${LLVM_CSPROFILE_DATA_DIR}\""
       CMAKE_CXX_FLAGS


