[llvm-branch-commits] [clang] 2f1aff8 - add -fpch-codegen/debuginfo mapping to -fmodules-codegen/debuginfo

2020-07-23 Thread Hans Wennborg via llvm-branch-commits

Author: Luboš Luňák
Date: 2020-07-23T14:54:03+02:00
New Revision: 2f1aff8325387b8ca1c9a1a14e2065827e9b1c15

URL: https://github.com/llvm/llvm-project/commit/2f1aff8325387b8ca1c9a1a14e2065827e9b1c15
DIFF: https://github.com/llvm/llvm-project/commit/2f1aff8325387b8ca1c9a1a14e2065827e9b1c15.diff

LOG: add -fpch-codegen/debuginfo mapping to -fmodules-codegen/debuginfo

Using -fmodules-* options for PCHs is a bit confusing, so add -fpch-*
variants. Having extra options also makes it simple to do a configure
check for the feature.
Also document the options in the release notes.

Differential Revision: https://reviews.llvm.org/D83623

(cherry picked from commit 54eea6127c4d77db03787b7c55765632fb9a6f1c)
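
The mapping described above can be sketched outside the driver. This is a minimal model, not clang's code: `lastFlagWins` and `translatePchFlags` are hypothetical names, and a plain vector of strings stands in for the driver's argument list. It only illustrates the last-one-wins lookup with a default of false, followed by translation to the -fmodules-* spellings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a last-one-wins flag lookup. This is not clang's
// ArgList API, only a model of the behaviour the commit relies on.
bool lastFlagWins(const std::vector<std::string> &Args, const std::string &Pos,
                  const std::string &Neg, bool Default) {
  bool Result = Default;
  for (const std::string &A : Args) {
    if (A == Pos)
      Result = true;   // positive spelling seen most recently
    else if (A == Neg)
      Result = false;  // negative spelling seen most recently
  }
  return Result;
}

// Translate driver-level -fpch-* flags to the -fmodules-* spellings,
// mirroring the two push_back calls added to Clang::ConstructJob.
std::vector<std::string>
translatePchFlags(const std::vector<std::string> &Args) {
  std::vector<std::string> CmdArgs;
  if (lastFlagWins(Args, "-fpch-codegen", "-fno-pch-codegen", false))
    CmdArgs.push_back("-fmodules-codegen");
  if (lastFlagWins(Args, "-fpch-debuginfo", "-fno-pch-debuginfo", false))
    CmdArgs.push_back("-fmodules-debuginfo");
  return CmdArgs;
}
```

In this model, passing -fpch-codegen followed by -fno-pch-codegen yields no translated flag, because the later spelling overrides the earlier one.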

Added: 


Modified: 
clang/docs/ReleaseNotes.rst
clang/include/clang/Driver/Options.td
clang/lib/Driver/ToolChains/Clang.cpp
clang/test/Driver/pch-codegen.cpp

Removed: 




diff  --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 8b27e663d9f8..3264846506c6 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -115,6 +115,32 @@ New Compiler Flags
   if the source header file is not self-contained. This option is enabled
   by default for clang-cl.
 
+- -fpch-codegen and -fpch-debuginfo generate shared code and/or debuginfo
+  for contents of a precompiled header in a separate object file. This object
+  file needs to be linked in, but its contents do not need to be generated
+  for other objects using the precompiled header. This should usually save
+  compile time. If not using clang-cl, the separate object file needs to
+  be created explicitly from the precompiled header.
+  Example of use:
+
+  .. code-block:: console
+
+    $ clang++ -x c++-header header.h -o header.pch -fpch-codegen -fpch-debuginfo
+    $ clang++ -c header.pch -o shared.o
+    $ clang++ -c source.cpp -o source.o -include-pch header.pch
+    $ clang++ -o binary source.o shared.o
+
+  - Using -fpch-instantiate-templates when generating the precompiled header
+    usually increases the amount of code/debuginfo that can be shared.
+  - In some cases, especially when building with optimizations enabled, using
+    -fpch-codegen may generate so much code in the shared object that compiling
+    it may be a net loss in build time.
+  - Since headers may bring in private symbols of other libraries, it may be
+    sometimes necessary to discard unused symbols (such as by adding
+    -Wl,--gc-sections on ELF platforms to the linking command, and possibly
+    adding -fdata-sections -ffunction-sections to the command generating
+    the shared object).
+
 Deprecated Compiler Flags
 -------------------------
 

diff  --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index f4556c15d744..b20b8a288221 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1440,6 +1440,10 @@ def fpch_instantiate_templates:
 def fno_pch_instantiate_templates:
   Flag <["-"], "fno-pch-instantiate-templates">,
   Group<f_Group>, Flags<[CC1Option]>;
+defm pch_codegen: OptInFFlag<"pch-codegen", "Generate ", "Do not generate ",
+  "code for uses of this PCH that assumes an explicit object file will be built for the PCH">;
+defm pch_debuginfo: OptInFFlag<"pch-debuginfo", "Generate ", "Do not generate ",
+  "debug info for types in an object file built from this PCH and do not generate them elsewhere">;
 
 def fmodules : Flag <["-"], "fmodules">, Group<f_Group>,
   Flags<[DriverOption, CC1Option]>,

diff  --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 9d6333bb5f1d..25fc837e803b 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -5627,6 +5627,12 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
   if (Args.hasFlag(options::OPT_fpch_instantiate_templates,
                    options::OPT_fno_pch_instantiate_templates, false))
     CmdArgs.push_back("-fpch-instantiate-templates");
+  if (Args.hasFlag(options::OPT_fpch_codegen, options::OPT_fno_pch_codegen,
+                   false))
+    CmdArgs.push_back("-fmodules-codegen");
+  if (Args.hasFlag(options::OPT_fpch_debuginfo, options::OPT_fno_pch_debuginfo,
+                   false))
+    CmdArgs.push_back("-fmodules-debuginfo");
 
   Args.AddLastArg(CmdArgs, options::OPT_fexperimental_new_pass_manager,
                   options::OPT_fno_experimental_new_pass_manager);

diff  --git a/clang/test/Driver/pch-codegen.cpp b/clang/test/Driver/pch-codegen.cpp
index 1b125107fb28..c6b6d9217e42 100644
--- a/clang/test/Driver/pch-codegen.cpp
+++ b/clang/test/Driver/pch-codegen.cpp
@@ -7,21 +7,21 @@
 // CHECK-PCH-CREATE-NOT: -fmodules-codegen
 // CHECK-PCH-CREATE-NOT: -fmodules-debuginfo
 
-// Create PCH with -fmodules-codegen.
-// RUN: %clang -x c++-header -Xclang -fmodules-codegen %S/../Modules/Inputs/codegen-flags/foo

[llvm-branch-commits] [clang] 3be0d86 - accept 'clang++ -c a.pch -o a.o' to create PCH's object file

2020-07-23 Thread Hans Wennborg via llvm-branch-commits

Author: Luboš Luňák
Date: 2020-07-23T14:54:03+02:00
New Revision: 3be0d8669f9a7e43cf909cdb29dc2cf087a4292d

URL: https://github.com/llvm/llvm-project/commit/3be0d8669f9a7e43cf909cdb29dc2cf087a4292d
DIFF: https://github.com/llvm/llvm-project/commit/3be0d8669f9a7e43cf909cdb29dc2cf087a4292d.diff

LOG: accept 'clang++ -c a.pch -o a.o' to create PCH's object file

This should work the same way as it does with a.pcm for modules.
An alternative way is 'clang++ -c empty.cpp -include-pch a.pch -o a.o
-Xclang -building-pch-with-obj', which is what clang-cl's /Yc does
internally.

Differential Revision: https://reviews.llvm.org/D83716

(cherry picked from commit 3895466e2c336c0797710ae35150ba1ce6bc0b96)
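
One detail of the frontend change is easy to miss: the -x value "precompiled-header" itself ends in "-header", so the suffix stripping has to skip it explicitly. The sketch below models that with `std::string`. `consumeBack` imitates what `llvm::StringRef::consume_back` does, and `isHeaderInput` is a hypothetical helper name; neither is the actual clang code.

```cpp
#include <cassert>
#include <string>

// Strip Suffix from the end of S if present and report whether it was
// stripped. A std::string imitation of llvm::StringRef::consume_back.
bool consumeBack(std::string &S, const std::string &Suffix) {
  if (S.size() < Suffix.size() ||
      S.compare(S.size() - Suffix.size(), Suffix.size(), Suffix) != 0)
    return false;
  S.erase(S.size() - Suffix.size());
  return true;
}

// Hypothetical helper modelling the IsHeaderFile computation: returns true
// when XValue names a header input, stripping the suffixes it recognises.
bool isHeaderInput(std::string &XValue) {
  bool Preprocessed = consumeBack(XValue, "-cpp-output");
  bool ModuleMap = consumeBack(XValue, "-module-map");
  // "precompiled-header" must stay intact so it can be matched as a
  // precompiled input kind later, hence the explicit comparison.
  return !Preprocessed && !ModuleMap && XValue != "precompiled-header" &&
         consumeBack(XValue, "-header");
}
```

Without the explicit comparison, "precompiled-header" would be reduced to "precompiled" and treated as a header file in this model.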

Added: 
clang/test/Driver/pch-codegen.cpp

Modified: 
clang/lib/Driver/Types.cpp
clang/lib/Frontend/CompilerInvocation.cpp
clang/test/PCH/codegen.cpp

Removed: 




diff  --git a/clang/lib/Driver/Types.cpp b/clang/lib/Driver/Types.cpp
index 399e26d8d64a..2050dffa6fa0 100644
--- a/clang/lib/Driver/Types.cpp
+++ b/clang/lib/Driver/Types.cpp
@@ -141,7 +141,7 @@ bool types::isAcceptedByClang(ID Id) {
   case TY_CXXHeader: case TY_PP_CXXHeader:
   case TY_ObjCXXHeader: case TY_PP_ObjCXXHeader:
   case TY_CXXModule: case TY_PP_CXXModule:
-  case TY_AST: case TY_ModuleFile:
+  case TY_AST: case TY_ModuleFile: case TY_PCH:
   case TY_LLVM_IR: case TY_LLVM_BC:
 return true;
   }

diff  --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp
index 75d7cf5d26d3..73114c6d76cb 100644
--- a/clang/lib/Frontend/CompilerInvocation.cpp
+++ b/clang/lib/Frontend/CompilerInvocation.cpp
@@ -2022,8 +2022,9 @@ static InputKind ParseFrontendArgs(FrontendOptions &Opts, ArgList &Args,
 // FIXME: Supporting '-header-cpp-output' would be useful.
 bool Preprocessed = XValue.consume_back("-cpp-output");
 bool ModuleMap = XValue.consume_back("-module-map");
-IsHeaderFile =
-!Preprocessed && !ModuleMap && XValue.consume_back("-header");
+IsHeaderFile = !Preprocessed && !ModuleMap &&
+   XValue != "precompiled-header" &&
+   XValue.consume_back("-header");
 
 // Principal languages.
 DashX = llvm::StringSwitch<InputKind>(XValue)
@@ -2050,7 +2051,7 @@ static InputKind ParseFrontendArgs(FrontendOptions &Opts, ArgList &Args,
   DashX = llvm::StringSwitch<InputKind>(XValue)
   .Case("cpp-output", InputKind(Language::C).getPreprocessed())
   .Case("assembler-with-cpp", Language::Asm)
-  .Cases("ast", "pcm",
+  .Cases("ast", "pcm", "precompiled-header",
  InputKind(Language::Unknown, InputKind::Precompiled))
   .Case("ir", Language::LLVM_IR)
   .Default(Language::Unknown);

diff  --git a/clang/test/Driver/pch-codegen.cpp b/clang/test/Driver/pch-codegen.cpp
new file mode 100644
index 000000000000..1b125107fb28
--- /dev/null
+++ b/clang/test/Driver/pch-codegen.cpp
@@ -0,0 +1,38 @@
+// RUN: rm -rf %t
+// RUN: mkdir -p %t
+
+// Create PCH without codegen.
+// RUN: %clang -x c++-header %S/../Modules/Inputs/codegen-flags/foo.h -o %t/foo-cg.pch -### 2>&1 | FileCheck %s -check-prefix=CHECK-PCH-CREATE
+// CHECK-PCH-CREATE: -emit-pch
+// CHECK-PCH-CREATE-NOT: -fmodules-codegen
+// CHECK-PCH-CREATE-NOT: -fmodules-debuginfo
+
+// Create PCH with -fmodules-codegen.
+// RUN: %clang -x c++-header -Xclang -fmodules-codegen %S/../Modules/Inputs/codegen-flags/foo.h -o %t/foo-cg.pch -### 2>&1 | FileCheck %s -check-prefix=CHECK-PCH-CODEGEN-CREATE
+// CHECK-PCH-CODEGEN-CREATE: -emit-pch
+// CHECK-PCH-CODEGEN-CREATE: -fmodules-codegen
+// CHECK-PCH-CODEGEN-CREATE: "-x" "c++-header"
+// CHECK-PCH-CODEGEN-CREATE-NOT: -fmodules-debuginfo
+
+// Create PCH with -fmodules-debuginfo.
+// RUN: %clang -x c++-header -Xclang -fmodules-debuginfo %S/../Modules/Inputs/codegen-flags/foo.h -g -o %t/foo-di.pch -### 2>&1 | FileCheck %s -check-prefix=CHECK-PCH-DEBUGINFO-CREATE
+// CHECK-PCH-DEBUGINFO-CREATE: -emit-pch
+// CHECK-PCH-DEBUGINFO-CREATE: -fmodules-debuginfo
+// CHECK-PCH-DEBUGINFO-CREATE: "-x" "c++-header"
+// CHECK-PCH-DEBUGINFO-CREATE-NOT: -fmodules-codegen
+
+// Create PCH's object file for -fmodules-codegen.
+// RUN: touch %t/foo-cg.pch
+// RUN: %clang -c %t/foo-cg.pch -o %t/foo-cg.o -### 2>&1 | FileCheck %s -check-prefix=CHECK-PCH-CODEGEN-OBJ
+// CHECK-PCH-CODEGEN-OBJ: -emit-obj
+// CHECK-PCH-CODEGEN-OBJ: "-main-file-name" "foo-cg.pch"
+// CHECK-PCH-CODEGEN-OBJ: "-o" "{{.*}}foo-cg.o"
+// CHECK-PCH-CODEGEN-OBJ: "-x" "precompiled-header"
+
+// Create PCH's object file for -fmodules-debuginfo.
+// RUN: touch %t/foo-di.pch
+// RUN: %clang -c %t/foo-di.pch -g -o %t/foo-di.o -### 2>&1 | FileCheck %s -check-prefix=CHECK-PCH-DEBUGINFO-OBJ
+// CHECK-PCH-DEBUGINFO-OBJ: -emit-obj
+// CHECK-PCH-DEBUGINFO-OBJ: "-main-file-name" "foo-di.pch"
+// CHECK-PCH-DEBUGINFO-OBJ: "-o" "{{.*}

[llvm-branch-commits] [llvm] e9d37a2 - Drop the npm run line from llvm/test/Analysis/ScalarEvolution/pr46786.ll

2020-07-23 Thread Hans Wennborg via llvm-branch-commits

Author: Hans Wennborg
Date: 2020-07-23T15:11:38+02:00
New Revision: e9d37a2ee97f820bc65e2badf5142414495580e5

URL: https://github.com/llvm/llvm-project/commit/e9d37a2ee97f820bc65e2badf5142414495580e5
DIFF: https://github.com/llvm/llvm-project/commit/e9d37a2ee97f820bc65e2badf5142414495580e5.diff

LOG: Drop the npm run line from llvm/test/Analysis/ScalarEvolution/pr46786.ll
since it's failing.

Added: 


Modified: 
llvm/test/Analysis/ScalarEvolution/pr46786.ll

Removed: 




diff  --git a/llvm/test/Analysis/ScalarEvolution/pr46786.ll b/llvm/test/Analysis/ScalarEvolution/pr46786.ll
index 21a65702b3a3..17110679c88e 100644
--- a/llvm/test/Analysis/ScalarEvolution/pr46786.ll
+++ b/llvm/test/Analysis/ScalarEvolution/pr46786.ll
@@ -1,6 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py
 ; RUN: opt < %s -analyze -enable-new-pm=0 -scalar-evolution | FileCheck %s
-; RUN: opt < %s -disable-output "-passes=print<scalar-evolution>" 2>&1 | FileCheck %s
 
 source_filename = "input.cpp"
 target datalayout = "e-m:e-p:32:32-i64:64-n32:64-S128"



___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 9c48156 - [SCEV] Remove premature assert. PR46786

2020-07-23 Thread Hans Wennborg via llvm-branch-commits

Author: Max Kazantsev
Date: 2020-07-23T15:02:00+02:00
New Revision: 9c48156c25f96cbd3b3405a53f681fee521514fa

URL: https://github.com/llvm/llvm-project/commit/9c48156c25f96cbd3b3405a53f681fee521514fa
DIFF: https://github.com/llvm/llvm-project/commit/9c48156c25f96cbd3b3405a53f681fee521514fa.diff

LOG: [SCEV] Remove premature assert. PR46786

This assert was added to verify the assumption that a GEP's SCEV will be of
pointer type, based on the fact that it should be a SCEVAddExpr whose last
operand (at least) is a pointer. Two notes:
- A GEP's SCEV does not have to be a SCEVAddExpr after all simplifications;
- In the current state, a GEP's SCEV does not have to have at least one
  pointer operand (all of them can become int during the transforms).

However, we might want to reach a point where this holds. For now we are
removing this assert and will try to enumerate the cases where the "is
pointer" notion might be lost during the transforms. When all of them are
fixed, we can bring it back.

Differential Revision: https://reviews.llvm.org/D84294
Reviewed By: lebedev.ri

(cherry picked from commit b96114c1e1fc4448ea966bce013706359aee3fa9)

Added: 
llvm/test/Analysis/ScalarEvolution/pr46786.ll

Modified: 
llvm/lib/Analysis/ScalarEvolution.cpp

Removed: 




diff  --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 48c686b73260..3c96b3f20461 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -3317,10 +3317,7 @@ ScalarEvolution::getGEPExpr(GEPOperator *GEP,
   }
 
   // Add the total offset from all the GEP indices to the base.
-  auto *GEPExpr = getAddExpr(BaseExpr, TotalOffset, Wrap);
-  assert(BaseExpr->getType() == GEPExpr->getType() &&
- "GEP should not change type mid-flight.");
-  return GEPExpr;
+  return getAddExpr(BaseExpr, TotalOffset, Wrap);
 }
 
 std::tuple

diff  --git a/llvm/test/Analysis/ScalarEvolution/pr46786.ll b/llvm/test/Analysis/ScalarEvolution/pr46786.ll
new file mode 100644
index 000000000000..21a65702b3a3
--- /dev/null
+++ b/llvm/test/Analysis/ScalarEvolution/pr46786.ll
@@ -0,0 +1,37 @@
+; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py
+; RUN: opt < %s -analyze -enable-new-pm=0 -scalar-evolution | FileCheck %s
+; RUN: opt < %s -disable-output "-passes=print<scalar-evolution>" 2>&1 | FileCheck %s
+
+source_filename = "input.cpp"
+target datalayout = "e-m:e-p:32:32-i64:64-n32:64-S128"
+
+; Function Attrs: nofree
+define i8* @FSE_decompress_usingDTable(i8* %arg, i32 %arg1, i32 %arg2, i32 %arg3) local_unnamed_addr #0 {
+; CHECK-LABEL: 'FSE_decompress_usingDTable'
+; CHECK-NEXT:  Classifying expressions for: @FSE_decompress_usingDTable
+; CHECK-NEXT:%i = getelementptr inbounds i8, i8* %arg, i32 %arg2
+; CHECK-NEXT:--> (%arg2 + %arg) U: full-set S: full-set
+; CHECK-NEXT:%i4 = sub nsw i32 0, %arg1
+; CHECK-NEXT:--> (-1 * %arg1) U: full-set S: full-set
+; CHECK-NEXT:%i5 = getelementptr inbounds i8, i8* %i, i32 %i4
+; CHECK-NEXT:--> ((-1 * %arg1) + %arg2 + %arg) U: full-set S: full-set
+; CHECK-NEXT:%i7 = select i1 %i6, i32 %arg2, i32 %arg1
+; CHECK-NEXT:--> ((-1 * %arg) + (((-1 * %arg1) + %arg2 + %arg) umin %arg) + %arg1) U: full-set S: full-set
+; CHECK-NEXT:%i8 = sub i32 %arg3, %i7
+; CHECK-NEXT:--> ((-1 * (((-1 * %arg1) + %arg2 + %arg) umin %arg)) + (-1 * %arg1) + %arg3 + %arg) U: full-set S: full-set
+; CHECK-NEXT:%i9 = getelementptr inbounds i8, i8* %arg, i32 %i8
+; CHECK-NEXT:--> ((2 * %arg) + (-1 * (((-1 * %arg1) + %arg2 + %arg) umin %arg)) + (-1 * %arg1) + %arg3) U: full-set S: full-set
+; CHECK-NEXT:  Determining loop execution counts for: @FSE_decompress_usingDTable
+;
+bb:
+  %i = getelementptr inbounds i8, i8* %arg, i32 %arg2
+  %i4 = sub nsw i32 0, %arg1
+  %i5 = getelementptr inbounds i8, i8* %i, i32 %i4
+  %i6 = icmp ult i8* %i5, %arg
+  %i7 = select i1 %i6, i32 %arg2, i32 %arg1
+  %i8 = sub i32 %arg3, %i7
+  %i9 = getelementptr inbounds i8, i8* %arg, i32 %i8
+  ret i8* %i9
+}
+
+attributes #0 = { nofree }





[llvm-branch-commits] [llvm] 826f730 - [InstCombine] Add test for PR46680 (NFC)

2020-07-23 Thread Hans Wennborg via llvm-branch-commits

Author: Nikita Popov
Date: 2020-07-23T15:13:59+02:00
New Revision: 826f730f3f1e2722059fe9d7f271a27a0d980a0f

URL: https://github.com/llvm/llvm-project/commit/826f730f3f1e2722059fe9d7f271a27a0d980a0f
DIFF: https://github.com/llvm/llvm-project/commit/826f730f3f1e2722059fe9d7f271a27a0d980a0f.diff

LOG: [InstCombine] Add test for PR46680 (NFC)

(cherry picked from commit 13ae440de4a408cf9d1a448def09769ecbecfdf7)

Added: 
llvm/test/Transforms/InstCombine/pr46680.ll

Modified: 


Removed: 




diff  --git a/llvm/test/Transforms/InstCombine/pr46680.ll b/llvm/test/Transforms/InstCombine/pr46680.ll
new file mode 100644
index 000000000000..90ea2e110afe
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/pr46680.ll
@@ -0,0 +1,92 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -S -instcombine -instcombine-infinite-loop-threshold=3 < %s | FileCheck %s
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-pc-linux-gnu"
+
+@a = dso_local local_unnamed_addr global i64 0, align 8
+@d = dso_local local_unnamed_addr global i64 0, align 8
+@c = external dso_local local_unnamed_addr global i8, align 1
+
+define void @test(i16* nocapture readonly %arg) local_unnamed_addr {
+; CHECK-LABEL: @test(
+; CHECK-NEXT:  bb:
+; CHECK-NEXT:[[I:%.*]] = load i64, i64* @d, align 8
+; CHECK-NEXT:[[I1:%.*]] = icmp eq i64 [[I]], 0
+; CHECK-NEXT:[[I2:%.*]] = load i64, i64* @a, align 8
+; CHECK-NEXT:[[I3:%.*]] = icmp ne i64 [[I2]], 0
+; CHECK-NEXT:br i1 [[I1]], label [[BB13:%.*]], label [[BB4:%.*]]
+; CHECK:   bb4:
+; CHECK-NEXT:[[I5:%.*]] = load i16, i16* [[ARG:%.*]], align 2
+; CHECK-NEXT:[[I6:%.*]] = trunc i16 [[I5]] to i8
+; CHECK-NEXT:store i8 [[I6]], i8* @c, align 1
+; CHECK-NEXT:tail call void @llvm.assume(i1 [[I3]])
+; CHECK-NEXT:br label [[BB22:%.*]]
+; CHECK:   bb13:
+; CHECK-NEXT:[[I14:%.*]] = load i16, i16* [[ARG]], align 2
+; CHECK-NEXT:[[I15:%.*]] = trunc i16 [[I14]] to i8
+; CHECK-NEXT:store i8 [[I15]], i8* @c, align 1
+; CHECK-NEXT:br label [[BB22]]
+; CHECK:   bb22:
+; CHECK-NEXT:[[STOREMERGE2_IN:%.*]] = load i16, i16* [[ARG]], align 2
+; CHECK-NEXT:[[STOREMERGE2:%.*]] = trunc i16 [[STOREMERGE2_IN]] to i8
+; CHECK-NEXT:store i8 [[STOREMERGE2]], i8* @c, align 1
+; CHECK-NEXT:[[STOREMERGE1_IN:%.*]] = load i16, i16* [[ARG]], align 2
+; CHECK-NEXT:[[STOREMERGE1:%.*]] = trunc i16 [[STOREMERGE1_IN]] to i8
+; CHECK-NEXT:store i8 [[STOREMERGE1]], i8* @c, align 1
+; CHECK-NEXT:[[STOREMERGE_IN:%.*]] = load i16, i16* [[ARG]], align 2
+; CHECK-NEXT:[[STOREMERGE:%.*]] = trunc i16 [[STOREMERGE_IN]] to i8
+; CHECK-NEXT:store i8 [[STOREMERGE]], i8* @c, align 1
+; CHECK-NEXT:br label [[BB23:%.*]]
+; CHECK:   bb23:
+; CHECK-NEXT:br label [[BB23]]
+;
+bb:
+  %i = load i64, i64* @d, align 8
+  %i1 = icmp eq i64 %i, 0
+  %i2 = load i64, i64* @a, align 8
+  %i3 = icmp ne i64 %i2, 0
+  br i1 %i1, label %bb13, label %bb4
+
+bb4:  ; preds = %bb
+  %i5 = load i16, i16* %arg, align 2
+  %i6 = trunc i16 %i5 to i8
+  store i8 %i6, i8* @c, align 1
+  tail call void @llvm.assume(i1 %i3)
+  %i7 = load i16, i16* %arg, align 2
+  %i8 = trunc i16 %i7 to i8
+  store i8 %i8, i8* @c, align 1
+  %i9 = load i16, i16* %arg, align 2
+  %i10 = trunc i16 %i9 to i8
+  store i8 %i10, i8* @c, align 1
+  %i11 = load i16, i16* %arg, align 2
+  %i12 = trunc i16 %i11 to i8
+  store i8 %i12, i8* @c, align 1
+  br label %bb22
+
+bb13: ; preds = %bb
+  %i14 = load i16, i16* %arg, align 2
+  %i15 = trunc i16 %i14 to i8
+  store i8 %i15, i8* @c, align 1
+  %i16 = load i16, i16* %arg, align 2
+  %i17 = trunc i16 %i16 to i8
+  store i8 %i17, i8* @c, align 1
+  %i18 = load i16, i16* %arg, align 2
+  %i19 = trunc i16 %i18 to i8
+  store i8 %i19, i8* @c, align 1
+  %i20 = load i16, i16* %arg, align 2
+  %i21 = trunc i16 %i20 to i8
+  store i8 %i21, i8* @c, align 1
+  br label %bb22
+
+bb22: ; preds = %bb13, %bb4
+  br label %bb23
+
+bb23: ; preds = %bb23, %bb22
+  br label %bb23
+}
+
+; Function Attrs: nounwind willreturn
+declare void @llvm.assume(i1) #0
+
+attributes #0 = { nounwind willreturn }





[llvm-branch-commits] [llvm] eb3c5db - [InstCombine] Fix store merge worklist management (PR46680)

2020-07-23 Thread Hans Wennborg via llvm-branch-commits

Author: Nikita Popov
Date: 2020-07-23T15:13:59+02:00
New Revision: eb3c5db40a1450d50c387f3a42f4c095001220cb

URL: 
https://github.com/llvm/llvm-project/commit/eb3c5db40a1450d50c387f3a42f4c095001220cb
DIFF: 
https://github.com/llvm/llvm-project/commit/eb3c5db40a1450d50c387f3a42f4c095001220cb.diff

LOG: [InstCombine] Fix store merge worklist management (PR46680)

Fixes https://bugs.llvm.org/show_bug.cgi?id=46680.

Just like insertions through IRBuilder, InsertNewInstBefore()
should be using the deferred worklist mechanism, so that processing
of newly added instructions is prioritized.

There's one side-effect of the worklist order change which could be
classified as a regression. An add op gets pushed through a select
that at the time is not a umax. We could add a reverse transform
that tries to push adds in the reverse direction to restore a min/max,
but that seems like a sure way of getting infinite loops... Seems
like something that should best wait on min/max intrinsics.

Differential Revision: https://reviews.llvm.org/D84109

(cherry picked from commit d12ec0f752e7f2c7f7252539da2d124264ec33f7)
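
To illustrate why the choice between push() and add() changes processing order, here is a toy model. It is not LLVM's worklist class: `ToyWorklist`, `processingOrder`, the string "instructions", and the exact flush policy are all assumptions made for the sketch. The model assumes the main list is processed from the back and that deferred entries are moved onto it, earliest first to be popped, before the next item is taken.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only. Names and semantics are assumptions for illustration.
class ToyWorklist {
  std::vector<std::string> Stack;    // processed from the back
  std::vector<std::string> Deferred; // held until the next pop

public:
  void push(const std::string &I) { Stack.push_back(I); } // immediate
  void add(const std::string &I) { Deferred.push_back(I); } // deferred
  bool empty() const { return Stack.empty() && Deferred.empty(); }

  // Move deferred entries onto the stack in reverse, so the entry that was
  // deferred first ends up on top, then take the top entry.
  // Precondition: the worklist is not empty.
  std::string pop() {
    while (!Deferred.empty()) {
      Stack.push_back(Deferred.back());
      Deferred.pop_back();
    }
    std::string I = Stack.back();
    Stack.pop_back();
    return I;
  }
};

// Simulate one visit that inserts a new instruction and then queues a user
// of the rewritten value through the deferred path.
std::vector<std::string> processingOrder(bool UseDeferred) {
  ToyWorklist WL;
  WL.push("pending"); // work that was already queued
  if (UseDeferred)
    WL.add("new");
  else
    WL.push("new");
  WL.add("user");
  std::vector<std::string> Order;
  while (!WL.empty())
    Order.push_back(WL.pop());
  return Order;
}
```

In this model the deferred path processes the new instruction before the user that was queued after it, while the immediate push leaves it behind that user.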

Added: 


Modified: 
llvm/lib/Transforms/InstCombine/InstCombineInternal.h
llvm/test/Transforms/InstCombine/minmax-fold.ll
llvm/test/Transforms/InstCombine/pr46680.ll

Removed: 




diff  --git a/llvm/lib/Transforms/InstCombine/InstCombineInternal.h b/llvm/lib/Transforms/InstCombine/InstCombineInternal.h
index f918dc7198ca..ca51f37af4d9 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineInternal.h
+++ b/llvm/lib/Transforms/InstCombine/InstCombineInternal.h
@@ -653,7 +653,7 @@ class LLVM_LIBRARY_VISIBILITY InstCombiner
"New instruction already inserted into a basic block!");
 BasicBlock *BB = Old.getParent();
 BB->getInstList().insert(Old.getIterator(), New); // Insert inst
-Worklist.push(New);
+Worklist.add(New);
 return New;
   }
 

diff  --git a/llvm/test/Transforms/InstCombine/minmax-fold.ll b/llvm/test/Transforms/InstCombine/minmax-fold.ll
index 5ee38978ed78..dcf060c09613 100644
--- a/llvm/test/Transforms/InstCombine/minmax-fold.ll
+++ b/llvm/test/Transforms/InstCombine/minmax-fold.ll
@@ -953,8 +953,8 @@ define i32 @add_umin(i32 %x) {
 
 define i32 @add_umin_constant_limit(i32 %x) {
 ; CHECK-LABEL: @add_umin_constant_limit(
-; CHECK-NEXT:[[TMP1:%.*]] = icmp eq i32 [[X:%.*]], 0
-; CHECK-NEXT:[[R:%.*]] = select i1 [[TMP1]], i32 41, i32 42
+; CHECK-NEXT:[[DOTNOT:%.*]] = icmp eq i32 [[X:%.*]], 0
+; CHECK-NEXT:[[R:%.*]] = select i1 [[DOTNOT]], i32 41, i32 42
 ; CHECK-NEXT:ret i32 [[R]]
 ;
   %a = add nuw i32 %x, 41
@@ -1165,8 +1165,8 @@ define <2 x i33> @add_umax_vec(<2 x i33> %x) {
 
 define i8 @PR14613_umin(i8 %x) {
 ; CHECK-LABEL: @PR14613_umin(
-; CHECK-NEXT:[[U7:%.*]] = call i8 @llvm.uadd.sat.i8(i8 [[X:%.*]], i8 15)
-; CHECK-NEXT:ret i8 [[U7]]
+; CHECK-NEXT:[[TMP1:%.*]] = call i8 @llvm.uadd.sat.i8(i8 [[X:%.*]], i8 15)
+; CHECK-NEXT:ret i8 [[TMP1]]
 ;
   %u4 = zext i8 %x to i32
   %u5 = add nuw nsw i32 %u4, 15
@@ -1179,8 +1179,8 @@ define i8 @PR14613_umin(i8 %x) {
 define i8 @PR14613_umax(i8 %x) {
 ; CHECK-LABEL: @PR14613_umax(
 ; CHECK-NEXT:[[TMP1:%.*]] = icmp ugt i8 [[X:%.*]], -16
-; CHECK-NEXT:[[TMP2:%.*]] = select i1 [[TMP1]], i8 [[X]], i8 -16
-; CHECK-NEXT:[[U7:%.*]] = add nsw i8 [[TMP2]], 15
+; CHECK-NEXT:[[X_OP:%.*]] = add i8 [[X]], 15
+; CHECK-NEXT:[[U7:%.*]] = select i1 [[TMP1]], i8 [[X_OP]], i8 -1
 ; CHECK-NEXT:ret i8 [[U7]]
 ;
   %u4 = zext i8 %x to i32
@@ -1422,8 +1422,8 @@ define <2 x i33> @add_smax_vec(<2 x i33> %x) {
 define i8 @PR14613_smin(i8 %x) {
 ; CHECK-LABEL: @PR14613_smin(
 ; CHECK-NEXT:[[TMP1:%.*]] = icmp slt i8 [[X:%.*]], 40
-; CHECK-NEXT:[[TMP2:%.*]] = select i1 [[TMP1]], i8 [[X]], i8 40
-; CHECK-NEXT:[[U7:%.*]] = add nsw i8 [[TMP2]], 15
+; CHECK-NEXT:[[X_OP:%.*]] = add i8 [[X]], 15
+; CHECK-NEXT:[[U7:%.*]] = select i1 [[TMP1]], i8 [[X_OP]], i8 55
 ; CHECK-NEXT:ret i8 [[U7]]
 ;
   %u4 = sext i8 %x to i32
@@ -1437,8 +1437,8 @@ define i8 @PR14613_smin(i8 %x) {
 define i8 @PR14613_smax(i8 %x) {
 ; CHECK-LABEL: @PR14613_smax(
 ; CHECK-NEXT:[[TMP1:%.*]] = icmp sgt i8 [[X:%.*]], 40
-; CHECK-NEXT:[[TMP2:%.*]] = select i1 [[TMP1]], i8 [[X]], i8 40
-; CHECK-NEXT:[[U7:%.*]] = add nuw i8 [[TMP2]], 15
+; CHECK-NEXT:[[X_OP:%.*]] = add i8 [[X]], 15
+; CHECK-NEXT:[[U7:%.*]] = select i1 [[TMP1]], i8 [[X_OP]], i8 55
 ; CHECK-NEXT:ret i8 [[U7]]
 ;
   %u4 = sext i8 %x to i32

diff  --git a/llvm/test/Transforms/InstCombine/pr46680.ll b/llvm/test/Transforms/InstCombine/pr46680.ll
index 90ea2e110afe..59d449d5dc23 100644
--- a/llvm/test/Transforms/InstCombine/pr46680.ll
+++ b/llvm/test/Transforms/InstCombine/pr46680.ll
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
-; RUN: opt -S -instcombine -instcombine-infinite-loop-

[llvm-branch-commits] [llvm] 8a2bc94 - [X86][AVX] getTargetShuffleMask - don't decode VBROADCAST(EXTRACT_SUBVECTOR(X, 0)) patterns.

2020-07-23 Thread Hans Wennborg via llvm-branch-commits

Author: Simon Pilgrim
Date: 2020-07-23T15:19:22+02:00
New Revision: 8a2bc9431193026454745d538cf7e5a5a6b6d5be

URL: https://github.com/llvm/llvm-project/commit/8a2bc9431193026454745d538cf7e5a5a6b6d5be
DIFF: https://github.com/llvm/llvm-project/commit/8a2bc9431193026454745d538cf7e5a5a6b6d5be.diff

LOG: [X86][AVX] getTargetShuffleMask - don't decode VBROADCAST(EXTRACT_SUBVECTOR(X,0)) patterns.

getTargetShuffleMask is used by the various "SimplifyDemanded" folds, so we
can't assume that the bypassed extract_subvector can be safely simplified.
getFauxShuffleMask performs a more general decode that allows us to catch
many of these cases more safely, so the impact is minimal.

(cherry picked from commit 5b5dc2442ac7a574a3b7d17c15ebeeb9eb3bec26)
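
As a rough illustration of the remaining behaviour, a broadcast decodes to a shuffle mask in which every lane selects element 0, and the decode is now attempted only when the operand has the same type as the result. The sketch below is a simplified model: `decodeBroadcastMask` and `tryDecodeBroadcast` are hypothetical names, and comparing element counts stands in for the value-type comparison done in the real code.

```cpp
#include <cassert>
#include <vector>

// A broadcast mask selects source element 0 for every result lane.
std::vector<int> decodeBroadcastMask(unsigned NumElems) {
  return std::vector<int>(NumElems, 0);
}

// Hypothetical guard: decode only when source and result widths match.
// Element counts stand in for the value-type check; this is an assumption
// of the sketch, not the actual X86 lowering code.
bool tryDecodeBroadcast(unsigned SrcElems, unsigned DstElems,
                        std::vector<int> &Mask) {
  if (SrcElems != DstElems)
    return false; // narrower source: leave it to a more general decode
  Mask = decodeBroadcastMask(DstElems);
  return true;
}
```

A narrower source, such as a 4-element operand broadcast into an 8-element result, is rejected here rather than looked through.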

Added: 


Modified: 
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/vector-fshl-256.ll
llvm/test/CodeGen/X86/vector-fshl-512.ll
llvm/test/CodeGen/X86/vector-fshl-rot-512.ll
llvm/test/CodeGen/X86/vector-fshr-256.ll
llvm/test/CodeGen/X86/vector-fshr-512.ll
llvm/test/CodeGen/X86/vector-fshr-rot-512.ll
llvm/test/CodeGen/X86/vector-rotate-512.ll

Removed: 




diff  --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index ea4b4734225d..f8b6b7eb3aff 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -6916,25 +6916,16 @@ static bool getTargetShuffleMask(SDNode *N, MVT VT, bool AllowSentinelZero,
 DecodeZeroMoveLowMask(NumElems, Mask);
 IsUnary = true;
 break;
-  case X86ISD::VBROADCAST: {
-SDValue N0 = N->getOperand(0);
-// See if we're broadcasting from index 0 of an EXTRACT_SUBVECTOR. If so,
-// add the pre-extracted value to the Ops vector.
-if (N0.getOpcode() == ISD::EXTRACT_SUBVECTOR &&
-N0.getOperand(0).getValueType() == VT &&
-N0.getConstantOperandVal(1) == 0)
-  Ops.push_back(N0.getOperand(0));
-
-// We only decode broadcasts of same-sized vectors, unless the broadcast
-// came from an extract from the original width. If we found one, we
-// pushed it the Ops vector above.
-if (N0.getValueType() == VT || !Ops.empty()) {
+  case X86ISD::VBROADCAST:
+// We only decode broadcasts of same-sized vectors, peeking through to
+// extracted subvectors is likely to cause hasOneUse issues with
+// SimplifyDemandedBits etc.
+if (N->getOperand(0).getValueType() == VT) {
   DecodeVectorBroadcast(NumElems, Mask);
   IsUnary = true;
   break;
 }
 return false;
-  }
   case X86ISD::VPERMILPV: {
 assert(N->getOperand(0).getValueType() == VT && "Unexpected value type");
 IsUnary = true;

diff  --git a/llvm/test/CodeGen/X86/vector-fshl-256.ll b/llvm/test/CodeGen/X86/vector-fshl-256.ll
index 12feea765898..0688107ed5c0 100644
--- a/llvm/test/CodeGen/X86/vector-fshl-256.ll
+++ b/llvm/test/CodeGen/X86/vector-fshl-256.ll
@@ -1092,12 +1092,12 @@ define <8 x i32> @splatvar_funnnel_v8i32(<8 x i32> %x, <8 x i32> %y, <8 x i32> %
 ; AVX2-NEXT:vpand %xmm3, %xmm2, %xmm2
 ; AVX2-NEXT:vpmovzxdq {{.*#+}} xmm3 = xmm2[0],zero,xmm2[1],zero
 ; AVX2-NEXT:vpslld %xmm3, %ymm0, %ymm3
+; AVX2-NEXT:vpbroadcastd %xmm2, %ymm2
 ; AVX2-NEXT:vpbroadcastd {{.*#+}} xmm4 = [32,32,32,32]
 ; AVX2-NEXT:vpsubd %xmm2, %xmm4, %xmm4
 ; AVX2-NEXT:vpmovzxdq {{.*#+}} xmm4 = xmm4[0],zero,xmm4[1],zero
 ; AVX2-NEXT:vpsrld %xmm4, %ymm1, %ymm1
 ; AVX2-NEXT:vpor %ymm1, %ymm3, %ymm1
-; AVX2-NEXT:vpbroadcastd %xmm2, %ymm2
 ; AVX2-NEXT:vpxor %xmm3, %xmm3, %xmm3
 ; AVX2-NEXT:vpcmpeqd %ymm3, %ymm2, %ymm2
 ; AVX2-NEXT:vblendvps %ymm2, %ymm0, %ymm1, %ymm0
@@ -1110,12 +1110,12 @@ define <8 x i32> @splatvar_funnnel_v8i32(<8 x i32> %x, <8 x i32> %y, <8 x i32> %
 ; AVX512F-NEXT:vpand %xmm3, %xmm2, %xmm2
 ; AVX512F-NEXT:vpmovzxdq {{.*#+}} xmm3 = xmm2[0],zero,xmm2[1],zero
 ; AVX512F-NEXT:vpslld %xmm3, %ymm0, %ymm3
+; AVX512F-NEXT:vpbroadcastd %xmm2, %ymm2
 ; AVX512F-NEXT:vpbroadcastd {{.*#+}} xmm4 = [32,32,32,32]
 ; AVX512F-NEXT:vpsubd %xmm2, %xmm4, %xmm4
 ; AVX512F-NEXT:vpmovzxdq {{.*#+}} xmm4 = xmm4[0],zero,xmm4[1],zero
 ; AVX512F-NEXT:vpsrld %xmm4, %ymm1, %ymm1
 ; AVX512F-NEXT:vpor %ymm1, %ymm3, %ymm1
-; AVX512F-NEXT:vpbroadcastd %xmm2, %ymm2
 ; AVX512F-NEXT:vptestnmd %zmm2, %zmm2, %k1
 ; AVX512F-NEXT:vmovdqa32 %zmm0, %zmm1 {%k1}
 ; AVX512F-NEXT:vmovdqa %ymm1, %ymm0
@@ -1126,12 +1126,12 @@ define <8 x i32> @splatvar_funnnel_v8i32(<8 x i32> %x, <8 x i32> %y, <8 x i32> %
 ; AVX512VL-NEXT:vpandd {{.*}}(%rip){1to4}, %xmm2, %xmm2
 ; AVX512VL-NEXT:vpmovzxdq {{.*#+}} xmm3 = xmm2[0],zero,xmm2[1],zero
 ; AVX512VL-NEXT:vpslld %xmm3, %ymm0, %ymm3
+; AVX512VL-NEXT:vpbroadcastd %xmm2, %ymm2
 ; AVX512VL-NEXT:vpbroadcastd {{.*#+}} xmm4 = [32,32,32,32]
 ; AVX512VL-NEXT:vpsubd %xmm2, %xmm4, %xmm4
 ; A