[llvm-branch-commits] [clang] dcdb124 - Add release notes for things relating to MinGW in the release

2021-08-16 Thread Martin Storsjö via llvm-branch-commits

Author: Martin Storsjö
Date: 2021-08-16T12:26:49+03:00
New Revision: dcdb12496c24a02874dc060efac68adf178284cc

URL: 
https://github.com/llvm/llvm-project/commit/dcdb12496c24a02874dc060efac68adf178284cc
DIFF: 
https://github.com/llvm/llvm-project/commit/dcdb12496c24a02874dc060efac68adf178284cc.diff

LOG: Add release notes for things relating to MinGW in the release

Added: 


Modified: 
clang/docs/ReleaseNotes.rst
lld/docs/ReleaseNotes.rst
llvm/docs/ReleaseNotes.rst

Removed: 




diff  --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index a6f43bcaa4bb..285e057d92dd 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -123,6 +123,9 @@ Attribute Changes in Clang
 Windows Support
 ---
 
+- Fixed reading ``long double`` arguments with ``va_arg`` on x86_64 MinGW
+  targets.
+
 C Language Changes in Clang
 ---
 

diff  --git a/lld/docs/ReleaseNotes.rst b/lld/docs/ReleaseNotes.rst
index a52ee4348f78..bf1c36f38f14 100644
--- a/lld/docs/ReleaseNotes.rst
+++ b/lld/docs/ReleaseNotes.rst
@@ -42,12 +42,29 @@ Breaking changes
 COFF Improvements
 -
 
-* ...
+* Avoid thread exhaustion when running on 32 bit Windows.
+  (`D105506 <https://reviews.llvm.org/D105506>`_)
+
+* Improve terminating the process on Windows while a thread pool might be
+  running. (`D102944 <https://reviews.llvm.org/D102944>`_)
 
 MinGW Improvements
 --
 
-* ...
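+* Support for linking directly against a DLL without using an import library
+  has been added. (`D104530 <https://reviews.llvm.org/D104530>`_ and
+  `D104531 <https://reviews.llvm.org/D104531>`_)
+
+* Fix linking with ``--export-all-symbols`` in combination with
+  ``-function-sections``. (`D101522 <https://reviews.llvm.org/D101522>`_ and
+  `D101615 <https://reviews.llvm.org/D101615>`_)
+
+* Fix automatic export of symbols from LTO objects.
+  (`D101569 <https://reviews.llvm.org/D101569>`_)
+
+* Accept more spellings of some options.
+  (`D107237 <https://reviews.llvm.org/D107237>`_ and
+  `D107253 <https://reviews.llvm.org/D107253>`_)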
 
 MachO Improvements
 --

diff  --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index de442580a149..9e9a7722b5e0 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -80,10 +80,15 @@ Changes to the AArch64 Backend
 * Introduced assembly support for Armv9-A's Realm Management Extension (RME)
   and Scalable Matrix Extension (SME).
 
+* Produce proper cross-section relative relocations on COFF
+
+* Fixed the calling convention on Windows for variadic functions involving
+  floats in the fixed arguments
+
 Changes to the ARM Backend
 --
 
-During this release ...
+* Produce proper cross-section relative relocations on COFF
 
 Changes to the MIPS Target
 --
@@ -241,6 +246,15 @@ Changes to the LLVM tools
 * In lli the default JIT engine switched from MCJIT (``-jit-kind=mcjit``) to ORC (``-jit-kind=orc``).
   (`D98931 <https://reviews.llvm.org/D98931>`_)
 
+* llvm-rc got support for invoking Clang to preprocess its input.
+  (`D100755 <https://reviews.llvm.org/D100755>`_)
+
+* llvm-rc got a GNU windres compatible frontend, llvm-windres.
+  (`D100756 <https://reviews.llvm.org/D100756>`_)
+
+* llvm-ml has improved compatibility with MS ml.exe, managing to assemble
+  more asm files.
+
 Changes to LLDB
 -
 



___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] a19747e - Fix type in DenseMap to match V.size()

2021-08-16 Thread Renato Golin via llvm-branch-commits

Author: Renato Golin
Date: 2021-08-16T15:32:04+01:00
New Revision: a19747ea7395dd470345e4703f13bbb74647b019

URL: 
https://github.com/llvm/llvm-project/commit/a19747ea7395dd470345e4703f13bbb74647b019
DIFF: 
https://github.com/llvm/llvm-project/commit/a19747ea7395dd470345e4703f13bbb74647b019.diff

LOG: Fix type in DenseMap to match V.size()

Differential Revision: https://reviews.llvm.org/D108124

Added: 


Modified: 
llvm/include/llvm/ADT/SmallBitVector.h

Removed: 




diff  --git a/llvm/include/llvm/ADT/SmallBitVector.h 
b/llvm/include/llvm/ADT/SmallBitVector.h
index f570bac23ad51..c70bc88fb1f24 100644
--- a/llvm/include/llvm/ADT/SmallBitVector.h
+++ b/llvm/include/llvm/ADT/SmallBitVector.h
@@ -721,7 +721,7 @@ template <> struct DenseMapInfo<SmallBitVector> {
   }
   static unsigned getHashValue(const SmallBitVector &V) {
 uintptr_t Store;
-return DenseMapInfo>>::getHashValue(
+return DenseMapInfo>>::getHashValue(
 std::make_pair(V.size(), V.getData(Store)));
   }
   static bool isEqual(const SmallBitVector &LHS, const SmallBitVector &RHS) {





[llvm-branch-commits] [lld] 1bbe8ef - [lld-macho] Fill out release notes for 13.x

2021-08-16 Thread Jez Ng via llvm-branch-commits

Author: Jez Ng
Date: 2021-08-16T13:19:00-04:00
New Revision: 1bbe8ef81549b37d2c74cfb607cffe0e01ab4dd3

URL: 
https://github.com/llvm/llvm-project/commit/1bbe8ef81549b37d2c74cfb607cffe0e01ab4dd3
DIFF: 
https://github.com/llvm/llvm-project/commit/1bbe8ef81549b37d2c74cfb607cffe0e01ab4dd3.diff

LOG: [lld-macho] Fill out release notes for 13.x

I probably missed out some things, given how much work was done in the
last few months...

Reviewed By: #lld-macho, oontvoo

Differential Revision: https://reviews.llvm.org/D107922

Added: 


Modified: 
lld/docs/ReleaseNotes.rst

Removed: 




diff  --git a/lld/docs/ReleaseNotes.rst b/lld/docs/ReleaseNotes.rst
index bf1c36f38f14..9ae375523518 100644
--- a/lld/docs/ReleaseNotes.rst
+++ b/lld/docs/ReleaseNotes.rst
@@ -66,10 +66,51 @@ MinGW Improvements
   (`D107237 <https://reviews.llvm.org/D107237>`_ and
   `D107253 <https://reviews.llvm.org/D107253>`_)
 
-MachO Improvements
---
-
-* Item 1.
+Mach-O Improvements
+---
+
+The Mach-O backend is now able to link several large, real-world programs,
+though we are still working out the kinks.
+
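+* arm64 is now supported as a target. (`D88629 <https://reviews.llvm.org/D88629>`_)
+* arm64_32 is now supported as a target. (`D99822 <https://reviews.llvm.org/D99822>`_)
+* Branch-range-extension thunks are now supported. (`D100818 <https://reviews.llvm.org/D100818>`_)
+* ``-dead_strip`` is now supported. (`D103324 <https://reviews.llvm.org/D103324>`_)
+* Support for identical code folding (``--icf=all``) has been added.
+  (`D103292 <https://reviews.llvm.org/D103292>`_)
+* Support for special ``$start`` and ``$end`` symbols for segment & sections has been
+  added. (`D106767 <https://reviews.llvm.org/D106767>`_, `D106629 <https://reviews.llvm.org/D106629>`_)
+* ``$ld$previous`` symbols are now supported. (`D103505 <https://reviews.llvm.org/D103505>`_)
+* ``$ld$install_name`` symbols are now supported. (`D103746 <https://reviews.llvm.org/D103746>`_)
+* ``__mh_*_header`` symbols are now supported. (`D97007 <https://reviews.llvm.org/D97007>`_)
+* LC_CODE_SIGNATURE is now supported. (`D96164 <https://reviews.llvm.org/D96164>`_)
+* LC_FUNCTION_STARTS is now supported. (`D97260 <https://reviews.llvm.org/D97260>`_)
+* LC_DATA_IN_CODE is now supported. (`D103006 <https://reviews.llvm.org/D103006>`_)
+* Bind opcodes are more compactly encoded. (`D106128 <https://reviews.llvm.org/D106128>`_,
+  `D105075 <https://reviews.llvm.org/D105075>`_)
+* LTO cache support has been added. (`D105922 <https://reviews.llvm.org/D105922>`_)
+* ``-application_extension`` is now supported. (`D105818 <https://reviews.llvm.org/D105818>`_)
+* ``-export_dynamic`` is now partially supported. (`D105482 <https://reviews.llvm.org/D105482>`_)
+* ``-arch_multiple`` is now supported. (`D105450 <https://reviews.llvm.org/D105450>`_)
+* ``-final_output`` is now supported. (`D105449 <https://reviews.llvm.org/D105449>`_)
+* ``-umbrella`` is now supported. (`D105448 <https://reviews.llvm.org/D105448>`_)
+* ``--print-dylib-search`` is now supported. (`D103985 <https://reviews.llvm.org/D103985>`_)
+* ``-force_load_swift_libs`` is now supported. (`D103709 <https://reviews.llvm.org/D103709>`_)
+* ``-reexport_framework``, ``-reexport_library``, ``-reexport-l`` are now supported.
+  (`D103497 <https://reviews.llvm.org/D103497>`_)
+* ``.weak_def_can_be_hidden`` is now supported. (`D101080 <https://reviews.llvm.org/D101080>`_)
+* ``-add_ast_path`` is now supported. (`D100076 <https://reviews.llvm.org/D100076>`_)
+* ``-segprot`` is now supported.  (`D99389 <https://reviews.llvm.org/D99389>`_)
+* ``-dependency_info`` is now partially supported. (`D98559 <https://reviews.llvm.org/D98559>`_)
+* ``--time-trace`` is now supported. (`D98419 <https://reviews.llvm.org/D98419>`_)
+* ``-mark_dead_strippable_dylib`` is now supported. (`D98262 <https://reviews.llvm.org/D98262>`_)
+* ``-[un]exported_symbol[s_list]`` is now supported. (`D98223 <https://reviews.llvm.org/D98223>`_)
+* ``-flat_namespace`` is now supported. (`D97641 <https://reviews.llvm.org/D97641>`_)
+* ``-rename_section`` and ``-rename_segment`` are now supported. (`D97600 <https://reviews.llvm.org/D97600>`_)
+* ``-bundle_loader`` is now supported. (`D95913 <https://reviews.llvm.org/D95913>`_)
+* ``-map`` is now partially supported. (`D98323 <https://reviews.llvm.org/D98323>`_)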
+
+There were numerous other bug-fixes as well.
 
 WebAssembly Improvements
 





[llvm-branch-commits] [llvm] 740f082 - [NFC] Clean up tests in test/Transforms/LoopVectorize/assume.ll

2021-08-16 Thread Tom Stellard via llvm-branch-commits

Author: David Sherwood
Date: 2021-08-16T11:32:33-07:00
New Revision: 740f08210e5dda848069f3175dcc2d12328c36ad

URL: 
https://github.com/llvm/llvm-project/commit/740f08210e5dda848069f3175dcc2d12328c36ad
DIFF: 
https://github.com/llvm/llvm-project/commit/740f08210e5dda848069f3175dcc2d12328c36ad.diff

LOG: [NFC] Clean up tests in test/Transforms/LoopVectorize/assume.ll

The tests previously had lots of unnecessary CHECK lines, where
all we really need to check is the presence (or absence) of the
assume intrinsic and the correct input operands.

Differential Revision: https://reviews.llvm.org/D107157

(cherry picked from commit 1172a8a7639399fe0b8a6c78a7123b1c3f9cf833)

Added: 


Modified: 
llvm/test/Transforms/LoopVectorize/assume.ll

Removed: 




diff  --git a/llvm/test/Transforms/LoopVectorize/assume.ll 
b/llvm/test/Transforms/LoopVectorize/assume.ll
index b833322967080..10cb67fd4e6a0 100644
--- a/llvm/test/Transforms/LoopVectorize/assume.ll
+++ b/llvm/test/Transforms/LoopVectorize/assume.ll
@@ -1,72 +1,23 @@
-; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt < %s  -loop-vectorize -force-vector-width=2 
-force-vector-interleave=2  -S | FileCheck %s
 
 define void @test1(float* noalias nocapture %a, float* noalias nocapture 
readonly %b) {
 ; CHECK-LABEL: @test1(
-; CHECK-NEXT:  entry:
-; CHECK-NEXT:br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
-; CHECK:   vector.ph:
-; CHECK-NEXT:br label [[VECTOR_BODY:%.*]]
 ; CHECK:   vector.body:
-; CHECK-NEXT:[[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ 
[[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
-; CHECK-NEXT:[[TMP0:%.*]] = add i64 [[INDEX]], 0
-; CHECK-NEXT:[[TMP1:%.*]] = add i64 [[INDEX]], 2
-; CHECK-NEXT:[[TMP2:%.*]] = getelementptr inbounds float, float* 
[[B:%.*]], i64 [[TMP0]]
-; CHECK-NEXT:[[TMP3:%.*]] = getelementptr inbounds float, float* [[B]], 
i64 [[TMP1]]
-; CHECK-NEXT:[[TMP4:%.*]] = getelementptr inbounds float, float* [[TMP2]], 
i32 0
-; CHECK-NEXT:[[TMP5:%.*]] = bitcast float* [[TMP4]] to <2 x float>*
-; CHECK-NEXT:[[WIDE_LOAD:%.*]] = load <2 x float>, <2 x float>* [[TMP5]], 
align 4
-; CHECK-NEXT:[[TMP6:%.*]] = getelementptr inbounds float, float* [[TMP2]], 
i32 2
-; CHECK-NEXT:[[TMP7:%.*]] = bitcast float* [[TMP6]] to <2 x float>*
-; CHECK-NEXT:[[WIDE_LOAD1:%.*]] = load <2 x float>, <2 x float>* [[TMP7]], 
align 4
-; CHECK-NEXT:[[TMP8:%.*]] = fcmp ogt <2 x float> [[WIDE_LOAD]], 
-; CHECK-NEXT:[[TMP9:%.*]] = fcmp ogt <2 x float> [[WIDE_LOAD1]], 
-; CHECK-NEXT:[[TMP10:%.*]] = extractelement <2 x i1> [[TMP8]], i32 0
-; CHECK-NEXT:tail call void @llvm.assume(i1 [[TMP10]])
-; CHECK-NEXT:[[TMP11:%.*]] = extractelement <2 x i1> [[TMP8]], i32 1
-; CHECK-NEXT:tail call void @llvm.assume(i1 [[TMP11]])
-; CHECK-NEXT:[[TMP12:%.*]] = extractelement <2 x i1> [[TMP9]], i32 0
-; CHECK-NEXT:tail call void @llvm.assume(i1 [[TMP12]])
-; CHECK-NEXT:[[TMP13:%.*]] = extractelement <2 x i1> [[TMP9]], i32 1
-; CHECK-NEXT:tail call void @llvm.assume(i1 [[TMP13]])
-; CHECK-NEXT:[[TMP14:%.*]] = fadd <2 x float> [[WIDE_LOAD]], 
-; CHECK-NEXT:[[TMP15:%.*]] = fadd <2 x float> [[WIDE_LOAD1]], 
-; CHECK-NEXT:[[TMP16:%.*]] = getelementptr inbounds float, float* 
[[A:%.*]], i64 [[TMP0]]
-; CHECK-NEXT:[[TMP17:%.*]] = getelementptr inbounds float, float* [[A]], 
i64 [[TMP1]]
-; CHECK-NEXT:[[TMP18:%.*]] = getelementptr inbounds float, float* 
[[TMP16]], i32 0
-; CHECK-NEXT:[[TMP19:%.*]] = bitcast float* [[TMP18]] to <2 x float>*
-; CHECK-NEXT:store <2 x float> [[TMP14]], <2 x float>* [[TMP19]], align 4
-; CHECK-NEXT:[[TMP20:%.*]] = getelementptr inbounds float, float* 
[[TMP16]], i32 2
-; CHECK-NEXT:[[TMP21:%.*]] = bitcast float* [[TMP20]] to <2 x float>*
-; CHECK-NEXT:store <2 x float> [[TMP15]], <2 x float>* [[TMP21]], align 4
-; CHECK-NEXT:[[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
-; CHECK-NEXT:[[TMP22:%.*]] = icmp eq i64 [[INDEX_NEXT]], 1600
-; CHECK-NEXT:br i1 [[TMP22]], label [[MIDDLE_BLOCK:%.*]], label 
[[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
-; CHECK:   middle.block:
-; CHECK-NEXT:[[CMP_N:%.*]] = icmp eq i64 1600, 1600
-; CHECK-NEXT:br i1 [[CMP_N]], label [[FOR_END:%.*]], label [[SCALAR_PH]]
-; CHECK:   scalar.ph:
-; CHECK-NEXT:[[BC_RESUME_VAL:%.*]] = phi i64 [ 1600, [[MIDDLE_BLOCK]] ], [ 
0, [[ENTRY:%.*]] ]
-; CHECK-NEXT:br label [[FOR_BODY:%.*]]
-; CHECK:   for.body:
-; CHECK-NEXT:[[INDVARS_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], 
[[SCALAR_PH]] ], [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ]
-; CHECK-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds float, float* 
[[B]], i64 [[INDVARS_IV]]
-; CHECK-NEXT:[[TMP23:%.*]] = load float, float* [[ARRAYIDX]], align 4
-; CHECK-NEXT:[[CMP1:%.*]] = fcmp ogt float [[TMP23]], 1.00e+02
-; CHECK-NEXT:

[llvm-branch-commits] [llvm] a57d981 - [LoopVectorize] Improve vectorisation of some intrinsics by treating them as uniform

2021-08-16 Thread Tom Stellard via llvm-branch-commits

Author: David Sherwood
Date: 2021-08-16T11:32:41-07:00
New Revision: a57d98111e63d390686f30b07fd1b1227ab99086

URL: 
https://github.com/llvm/llvm-project/commit/a57d98111e63d390686f30b07fd1b1227ab99086
DIFF: 
https://github.com/llvm/llvm-project/commit/a57d98111e63d390686f30b07fd1b1227ab99086.diff

LOG: [LoopVectorize] Improve vectorisation of some intrinsics by treating them 
as uniform

This patch adds more instructions to the Uniforms list, for example certain
intrinsics that are uniform by definition or whose operands are loop invariant.
This list includes:

  1. The intrinsics 'experimental.noalias.scope.decl' and 'sideeffect', which
  are always uniform by definition.
  2. The intrinsics 'lifetime.start', 'lifetime.end' and 'assume', which are
  also uniform if their input operands are loop invariant.

Also, in VPRecipeBuilder::handleReplication we check if an instruction is
uniform based purely on whether or not the instruction lives in the Uniforms
list. However, there are certain cases where calls to some intrinsics can
be effectively treated as uniform too. Therefore, we now also treat the
following cases as uniform for scalable vectors:

  1. If the 'assume' intrinsic's operand is not loop invariant, then we
  are free to treat this as uniform anyway since it's only a performance
  hint. We will get the benefit for the first lane.
  2. When the input pointers for 'lifetime.start' and 'lifetime.end' are loop
  variant then for scalable vectors we assume these still ultimately come
  from the broadcast of an alloca. We do not support scalable vectorisation
  of loops containing alloca instructions, hence the alloca itself would
  be invariant. If the pointer does not come from an alloca then the
  intrinsic itself has no effect.
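
The two sets of rules above can be summarised in a small decision function. This is a minimal sketch and not the LoopVectorize code: `ToyIntrinsic`, `inUniformsList` and `treatedAsUniform` are hypothetical names, and the sketch models only the intrinsics named in this message, not the other ways an instruction can end up in the Uniforms list.

```cpp
#include <cassert> // used only by the checks in the usage note

// Hypothetical stand-ins for the intrinsic IDs named in the log message.
enum class ToyIntrinsic {
  SideEffect,
  NoAliasScopeDecl,
  Assume,
  LifetimeStart,
  LifetimeEnd,
  Other
};

// First rule set: does the call go into the Uniforms list?
bool inUniformsList(ToyIntrinsic ID, bool OperandsLoopInvariant) {
  switch (ID) {
  case ToyIntrinsic::SideEffect:
  case ToyIntrinsic::NoAliasScopeDecl:
    return true; // uniform by definition
  case ToyIntrinsic::Assume:
  case ToyIntrinsic::LifetimeStart:
  case ToyIntrinsic::LifetimeEnd:
    return OperandsLoopInvariant; // uniform only with invariant operands
  case ToyIntrinsic::Other:
    return false; // not modelled in this sketch
  }
  return false;
}

// Second rule set: the extra case applied for scalable vectors when the
// call is not in the Uniforms list.
bool treatedAsUniform(ToyIntrinsic ID, bool OperandsLoopInvariant,
                      bool ScalableVF) {
  if (inUniformsList(ID, OperandsLoopInvariant))
    return true;
  if (!ScalableVF)
    return false; // fixed-width VFs can fall back on full scalarization
  return ID == ToyIntrinsic::Assume || ID == ToyIntrinsic::LifetimeStart ||
         ID == ToyIntrinsic::LifetimeEnd;
}
```

For example, `treatedAsUniform(ToyIntrinsic::Assume, false, true)` is true (scalable VF, variant operand), while the same call with `ScalableVF == false` is false.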

I have updated the assume test for fixed width, since we now treat it
as uniform:

  Transforms/LoopVectorize/assume.ll

I've also added new scalable vectorisation tests for other intrinsics:

  Transforms/LoopVectorize/scalable-assume.ll
  Transforms/LoopVectorize/scalable-lifetime.ll
  Transforms/LoopVectorize/scalable-noalias-scope-decl.ll

Differential Revision: https://reviews.llvm.org/D107284

(cherry picked from commit 3fd96e1b2e129b981f1bc1be2615486187e74687)

Added: 


Modified: 
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
llvm/test/Transforms/LoopVectorize/assume.ll

Removed: 




diff  --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp 
b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index f24ae6b100d57..671bc6b5212b5 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -5433,6 +5433,21 @@ void 
LoopVectorizationCostModel::collectLoopUniforms(ElementCount VF) {
   // lane 0 demanded or b) are uses which demand only lane 0 of their operand.
   for (auto *BB : TheLoop->blocks())
 for (auto &I : *BB) {
+  if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(&I)) {
+switch (II->getIntrinsicID()) {
+case Intrinsic::sideeffect:
+case Intrinsic::experimental_noalias_scope_decl:
+case Intrinsic::assume:
+case Intrinsic::lifetime_start:
+case Intrinsic::lifetime_end:
+  if (TheLoop->hasLoopInvariantOperands(&I))
+addToWorklistIfAllowed(&I);
+  break;
+default:
+  break;
+}
+  }
+
   // If there's no pointer operand, there's nothing to do.
   auto *Ptr = getLoadStorePointerOperand(&I);
   if (!Ptr)
@@ -8916,6 +8931,37 @@ VPBasicBlock *VPRecipeBuilder::handleReplication(
   bool IsPredicated = LoopVectorizationPlanner::getDecisionAndClampRange(
   [&](ElementCount VF) { return CM.isPredicatedInst(I); }, Range);
 
+  // Even if the instruction is not marked as uniform, there are certain
+  // intrinsic calls that can be effectively treated as such, so we check for
+  // them here. Conservatively, we only do this for scalable vectors, since
+  // for fixed-width VFs we can always fall back on full scalarization.
+  if (!IsUniform && Range.Start.isScalable() && isa<IntrinsicInst>(I)) {
+switch (cast<IntrinsicInst>(I)->getIntrinsicID()) {
+case Intrinsic::assume:
+case Intrinsic::lifetime_start:
+case Intrinsic::lifetime_end:
+  // For scalable vectors if one of the operands is variant then we still
+  // want to mark as uniform, which will generate one instruction for just
+  // the first lane of the vector. We can't scalarize the call in the same
+  // way as for fixed-width vectors because we don't know how many lanes
+  // there are.
+  //
+  // The reasons for doing it this way for scalable vectors are:
+  //   1. For the assume intrinsic generating the instruction for the first
+  //  lane is still be better than not generating any at all. For
+  //  example, the input may be a splat across all lanes.
+  //   2. For the lifetime start/end intrinsics the poin

[llvm-branch-commits] [clang] 0dd4f00 - [OpenMP]Fix PR50336: Remove temporary files in the offload bundler tool

2021-08-16 Thread Tom Stellard via llvm-branch-commits

Author: Joseph Huber
Date: 2021-08-16T11:34:06-07:00
New Revision: 0dd4f002e1d3ccddfc519e02f9ab5cdd9c8d26d1

URL: 
https://github.com/llvm/llvm-project/commit/0dd4f002e1d3ccddfc519e02f9ab5cdd9c8d26d1
DIFF: 
https://github.com/llvm/llvm-project/commit/0dd4f002e1d3ccddfc519e02f9ab5cdd9c8d26d1.diff

LOG: [OpenMP]Fix PR50336: Remove temporary files in the offload bundler tool

Temporary files created by the offloading device toolchain are not removed
after compilation when using a two-step compilation. The offload-bundler uses a
different filename for the device binary than the `.o` file present in the
Job's input list. This is not listed as a temporary file so it is never
removed. This patch explicitly adds the device binary as a temporary file to
consume it. This fixes PR50336.
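
The effect of registering the device binary can be illustrated with a toy model. This is a sketch under stated assumptions, not the clang driver: `ToyCompilation` and `bundlerInput` are hypothetical, the "disk" is just a set of file names, and the only idea carried over from the patch is that cleanup removes exactly the files that were registered as temporary.

```cpp
#include <cassert> // used only by the checks in the usage note
#include <set>
#include <string>
#include <vector>

// Toy compilation (hypothetical): tracks files on a simulated disk and the
// subset of them that were registered as temporary.
struct ToyCompilation {
  std::set<std::string> FilesOnDisk;
  std::vector<std::string> TempFiles;

  // Register Name as temporary and return it, so the call can be used
  // inline at the point where the file name is needed.
  std::string addTempFile(const std::string &Name) {
    TempFiles.push_back(Name);
    return Name;
  }

  // Remove only the registered temporary files.
  void cleanup() {
    for (const std::string &Name : TempFiles)
      FilesOnDisk.erase(Name);
    TempFiles.clear();
  }
};

// Returns the file name handed to the bundler. The device toolchain is
// assumed to have written DeviceFile already.
std::string bundlerInput(ToyCompilation &C, const std::string &DeviceFile,
                         bool RegisterAsTemp) {
  C.FilesOnDisk.insert(DeviceFile);
  return RegisterAsTemp ? C.addTempFile(DeviceFile) : DeviceFile;
}
```

With `RegisterAsTemp == false` the file is still on the simulated disk after `cleanup()`, which is the leak described above; with `true` it is removed.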

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D107668

(cherry picked from commit 01d59c0de822099c62f12f275c41338f6df9f5ac)

Added: 


Modified: 
clang/lib/Driver/ToolChains/Clang.cpp

Removed: 




diff  --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 1870bd81789c5..0e129e6f2fac8 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -7690,8 +7690,11 @@ void OffloadBundler::ConstructJob(Compilation &C, const 
JobAction &JA,
 assert(CurTC == nullptr && "Expected one dependence!");
 CurTC = TC;
   });
+  UB += C.addTempFile(
+  C.getArgs().MakeArgString(CurTC->getInputFilename(Inputs[I])));
+} else {
+  UB += CurTC->getInputFilename(Inputs[I]);
 }
-UB += CurTC->getInputFilename(Inputs[I]);
   }
   CmdArgs.push_back(TCArgs.MakeArgString(UB));
 





[llvm-branch-commits] [llvm] ba04851 - [InstSimplify] add tests for min/max idioms; NFC

2021-08-16 Thread Tom Stellard via llvm-branch-commits

Author: Sanjay Patel
Date: 2021-08-16T11:35:24-07:00
New Revision: ba048518e08fcb227359fae94da2a10dd37d2139

URL: 
https://github.com/llvm/llvm-project/commit/ba048518e08fcb227359fae94da2a10dd37d2139
DIFF: 
https://github.com/llvm/llvm-project/commit/ba048518e08fcb227359fae94da2a10dd37d2139.diff

LOG: [InstSimplify] add tests for min/max idioms; NFC

(cherry picked from commit 9b942a545cb53d4bae2071a2dea513be74f68221)

Added: 


Modified: 
llvm/test/Transforms/InstSimplify/maxmin.ll

Removed: 




diff  --git a/llvm/test/Transforms/InstSimplify/maxmin.ll 
b/llvm/test/Transforms/InstSimplify/maxmin.ll
index 3fcbfec2f63ad..e9fff33f63114 100644
--- a/llvm/test/Transforms/InstSimplify/maxmin.ll
+++ b/llvm/test/Transforms/InstSimplify/maxmin.ll
@@ -1,9 +1,75 @@
-; NOTE: Assertions have been autogenerated by update_test_checks.py
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt < %s -instsimplify -S | FileCheck %s
 
+define i8 @smax_min_limit(i8 %x) {
+; CHECK-LABEL: @smax_min_limit(
+; CHECK-NEXT:[[CMP:%.*]] = icmp sgt i8 [[X:%.*]], -128
+; CHECK-NEXT:[[SEL:%.*]] = select i1 [[CMP]], i8 [[X]], i8 -128
+; CHECK-NEXT:ret i8 [[SEL]]
+;
+  %cmp = icmp sgt i8 %x, -128
+  %sel = select i1 %cmp, i8 %x, i8 -128
+  ret i8 %sel
+}
+
+define i8 @smin_max_limit(i8 %x) {
+; CHECK-LABEL: @smin_max_limit(
+; CHECK-NEXT:[[CMP:%.*]] = icmp slt i8 [[X:%.*]], 127
+; CHECK-NEXT:[[SEL:%.*]] = select i1 [[CMP]], i8 [[X]], i8 127
+; CHECK-NEXT:ret i8 [[SEL]]
+;
+  %cmp = icmp slt i8 %x, 127
+  %sel = select i1 %cmp, i8 %x, i8 127
+  ret i8 %sel
+}
+
+define <2 x i8> @umax_min_limit(<2 x i8> %x) {
+; CHECK-LABEL: @umax_min_limit(
+; CHECK-NEXT:[[CMP:%.*]] = icmp ugt <2 x i8> [[X:%.*]], zeroinitializer
+; CHECK-NEXT:[[SEL:%.*]] = select <2 x i1> [[CMP]], <2 x i8> [[X]], <2 x 
i8> zeroinitializer
+; CHECK-NEXT:ret <2 x i8> [[SEL]]
+;
+  %cmp = icmp ugt <2 x i8> %x, zeroinitializer
+  %sel = select <2 x i1> %cmp, <2 x i8> %x, <2 x i8> zeroinitializer
+  ret <2 x i8> %sel
+}
+
+define i8 @umin_max_limit(i8 %x) {
+; CHECK-LABEL: @umin_max_limit(
+; CHECK-NEXT:[[CMP:%.*]] = icmp ult i8 [[X:%.*]], -1
+; CHECK-NEXT:[[SEL:%.*]] = select i1 [[CMP]], i8 [[X]], i8 -1
+; CHECK-NEXT:ret i8 [[SEL]]
+;
+  %cmp = icmp ult i8 %x, 255
+  %sel = select i1 %cmp, i8 %x, i8 255
+  ret i8 %sel
+}
+
+define i8 @smax_not_min_limit(i8 %x) {
+; CHECK-LABEL: @smax_not_min_limit(
+; CHECK-NEXT:[[CMP:%.*]] = icmp sgt i8 [[X:%.*]], -127
+; CHECK-NEXT:[[SEL:%.*]] = select i1 [[CMP]], i8 [[X]], i8 -127
+; CHECK-NEXT:ret i8 [[SEL]]
+;
+  %cmp = icmp sgt i8 %x, -127
+  %sel = select i1 %cmp, i8 %x, i8 -127
+  ret i8 %sel
+}
+
+define i8 @smin_not_min_limit(i8 %x) {
+; CHECK-LABEL: @smin_not_min_limit(
+; CHECK-NEXT:[[CMP:%.*]] = icmp slt i8 [[X:%.*]], 0
+; CHECK-NEXT:[[SEL:%.*]] = select i1 [[CMP]], i8 [[X]], i8 0
+; CHECK-NEXT:ret i8 [[SEL]]
+;
+  %cmp = icmp slt i8 %x, 0
+  %sel = select i1 %cmp, i8 %x, i8 0
+  ret i8 %sel
+}
+
 define i1 @max1(i32 %x, i32 %y) {
 ; CHECK-LABEL: @max1(
-; CHECK: ret i1 false
+; CHECK-NEXT:ret i1 false
 ;
   %c = icmp sgt i32 %x, %y
   %m = select i1 %c, i32 %x, i32 %y
@@ -13,7 +79,7 @@ define i1 @max1(i32 %x, i32 %y) {
 
 define i1 @max2(i32 %x, i32 %y) {
 ; CHECK-LABEL: @max2(
-; CHECK: ret i1 true
+; CHECK-NEXT:ret i1 true
 ;
   %c = icmp sge i32 %x, %y
   %m = select i1 %c, i32 %x, i32 %y
@@ -23,7 +89,7 @@ define i1 @max2(i32 %x, i32 %y) {
 
 define i1 @max3(i32 %x, i32 %y) {
 ; CHECK-LABEL: @max3(
-; CHECK: ret i1 false
+; CHECK-NEXT:ret i1 false
 ;
   %c = icmp ugt i32 %x, %y
   %m = select i1 %c, i32 %x, i32 %y
@@ -33,7 +99,7 @@ define i1 @max3(i32 %x, i32 %y) {
 
 define i1 @max4(i32 %x, i32 %y) {
 ; CHECK-LABEL: @max4(
-; CHECK: ret i1 true
+; CHECK-NEXT:ret i1 true
 ;
   %c = icmp uge i32 %x, %y
   %m = select i1 %c, i32 %x, i32 %y
@@ -43,7 +109,7 @@ define i1 @max4(i32 %x, i32 %y) {
 
 define i1 @max5(i32 %x, i32 %y) {
 ; CHECK-LABEL: @max5(
-; CHECK: ret i1 false
+; CHECK-NEXT:ret i1 false
 ;
   %c = icmp sgt i32 %x, %y
   %m = select i1 %c, i32 %x, i32 %y
@@ -53,7 +119,7 @@ define i1 @max5(i32 %x, i32 %y) {
 
 define i1 @max6(i32 %x, i32 %y) {
 ; CHECK-LABEL: @max6(
-; CHECK: ret i1 true
+; CHECK-NEXT:ret i1 true
 ;
   %c = icmp sge i32 %x, %y
   %m = select i1 %c, i32 %x, i32 %y
@@ -63,7 +129,7 @@ define i1 @max6(i32 %x, i32 %y) {
 
 define i1 @max7(i32 %x, i32 %y) {
 ; CHECK-LABEL: @max7(
-; CHECK: ret i1 false
+; CHECK-NEXT:ret i1 false
 ;
   %c = icmp ugt i32 %x, %y
   %m = select i1 %c, i32 %x, i32 %y
@@ -73,7 +139,7 @@ define i1 @max7(i32 %x, i32 %y) {
 
 define i1 @max8(i32 %x, i32 %y) {
 ; CHECK-LABEL: @max8(
-; CHECK: ret i1 true
+; CHECK-NEXT:ret i1 true
 ;
   %c = icmp uge i32 %x, %y
   %m = select i1 %c, i32 %x, i32 %y

[llvm-branch-commits] [llvm] f4006c5 - [InstSimplify] fold min/max with limit constant

2021-08-16 Thread Tom Stellard via llvm-branch-commits

Author: Sanjay Patel
Date: 2021-08-16T11:35:29-07:00
New Revision: f4006c59497d425f5a3df5e68da5add21f8e467d

URL: 
https://github.com/llvm/llvm-project/commit/f4006c59497d425f5a3df5e68da5add21f8e467d
DIFF: 
https://github.com/llvm/llvm-project/commit/f4006c59497d425f5a3df5e68da5add21f8e467d.diff

LOG: [InstSimplify] fold min/max with limit constant

This is already done within InstCombine:
https://alive2.llvm.org/ce/z/MiGE22

...but leaving it out of analysis makes it
harder to avoid infinite loops there.
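
The fold itself (a max with the type's minimum value, or a min with its maximum value, is just X) can be checked exhaustively for 8-bit integers. The sketch below is plain C++ and not LLVM code; `selectMax`, `selectMin` and `foldHoldsForAllValues` are hypothetical helpers that mimic the compare and select idiom.

```cpp
#include <cassert> // used only by the checks in the usage note
#include <cstdint>
#include <limits>

// Hypothetical helpers mirroring the idiom:
//   %cmp = icmp sgt %x, %limit ; %sel = select %cmp, %x, %limit
template <typename T> T selectMax(T X, T Limit) {
  return X > Limit ? X : Limit;
}
template <typename T> T selectMin(T X, T Limit) {
  return X < Limit ? X : Limit;
}

// Returns true if max(X, MIN) == X and min(X, MAX) == X for every value
// of the 8-bit type T (signed or unsigned).
template <typename T> bool foldHoldsForAllValues() {
  constexpr T Min = std::numeric_limits<T>::min();
  constexpr T Max = std::numeric_limits<T>::max();
  for (int V = Min; V <= Max; ++V) {
    T X = static_cast<T>(V);
    if (selectMax<T>(X, Min) != X || selectMin<T>(X, Max) != X)
      return false;
  }
  return true;
}
```

`foldHoldsForAllValues<std::int8_t>()` and `foldHoldsForAllValues<std::uint8_t>()` both return true. A constant that is not the limit does not fold: `selectMax<std::int8_t>(-128, -127)` is -127, not -128.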

(cherry picked from commit e260e10c4a21784c146c94a2a14b7e78b09a9cf7)

Added: 


Modified: 
llvm/include/llvm/Analysis/ValueTracking.h
llvm/lib/Analysis/InstructionSimplify.cpp
llvm/lib/Analysis/ValueTracking.cpp
llvm/test/Transforms/InstSimplify/maxmin.ll

Removed: 




diff  --git a/llvm/include/llvm/Analysis/ValueTracking.h 
b/llvm/include/llvm/Analysis/ValueTracking.h
index 90ec742f18e67..f46e66641c082 100644
--- a/llvm/include/llvm/Analysis/ValueTracking.h
+++ b/llvm/include/llvm/Analysis/ValueTracking.h
@@ -744,6 +744,10 @@ constexpr unsigned MaxAnalysisRecursionDepth = 6;
   /// minimum/maximum flavor.
   CmpInst::Predicate getInverseMinMaxPred(SelectPatternFlavor SPF);
 
+  /// Return the minimum or maximum constant value for the specified integer
+  /// min/max flavor and type.
+  APInt getMinMaxLimit(SelectPatternFlavor SPF, unsigned BitWidth);
+
   /// Check if the values in \p VL are select instructions that can be 
converted
   /// to a min or max (vector) intrinsic. Returns the intrinsic ID, if such a
   /// conversion is possible, together with a bool indicating whether all 
select

diff  --git a/llvm/lib/Analysis/InstructionSimplify.cpp 
b/llvm/lib/Analysis/InstructionSimplify.cpp
index 23083bc8178e7..69ab0052b0a74 100644
--- a/llvm/lib/Analysis/InstructionSimplify.cpp
+++ b/llvm/lib/Analysis/InstructionSimplify.cpp
@@ -4080,6 +4080,22 @@ static Value *simplifySelectWithICmpCond(Value *CondVal, 
Value *TrueVal,
 std::swap(TrueVal, FalseVal);
   }
 
+  // Check for integer min/max with a limit constant:
+  // X > MIN_INT ? X : MIN_INT --> X
+  // X < MAX_INT ? X : MAX_INT --> X
+  if (TrueVal->getType()->isIntOrIntVectorTy()) {
+Value *X, *Y;
+SelectPatternFlavor SPF =
+matchDecomposedSelectPattern(cast<ICmpInst>(CondVal), TrueVal, FalseVal,
+ X, Y).Flavor;
+if (SelectPatternResult::isMinOrMax(SPF) && Pred == getMinMaxPred(SPF)) {
+  APInt LimitC = getMinMaxLimit(getInverseMinMaxFlavor(SPF),
+X->getType()->getScalarSizeInBits());
+  if (match(Y, m_SpecificInt(LimitC)))
+return X;
+}
+  }
+
   if (Pred == ICmpInst::ICMP_EQ && match(CmpRHS, m_Zero())) {
 Value *X;
 const APInt *Y;

diff  --git a/llvm/lib/Analysis/ValueTracking.cpp 
b/llvm/lib/Analysis/ValueTracking.cpp
index 522d21812c6a5..6e3ca5c4e08ae 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -6253,6 +6253,16 @@ CmpInst::Predicate 
llvm::getInverseMinMaxPred(SelectPatternFlavor SPF) {
   return getMinMaxPred(getInverseMinMaxFlavor(SPF));
 }
 
+APInt llvm::getMinMaxLimit(SelectPatternFlavor SPF, unsigned BitWidth) {
+  switch (SPF) {
+  case SPF_SMAX: return APInt::getSignedMaxValue(BitWidth);
+  case SPF_SMIN: return APInt::getSignedMinValue(BitWidth);
+  case SPF_UMAX: return APInt::getMaxValue(BitWidth);
+  case SPF_UMIN: return APInt::getMinValue(BitWidth);
+  default: llvm_unreachable("Unexpected flavor");
+  }
+}
+
 std::pair<Intrinsic::ID, bool>
 llvm::canConvertToMinOrMaxIntrinsic(ArrayRef<Value *> VL) {
   // Check if VL contains select instructions that can be folded into a min/max

diff  --git a/llvm/test/Transforms/InstSimplify/maxmin.ll 
b/llvm/test/Transforms/InstSimplify/maxmin.ll
index e9fff33f63114..1dea800e6ecd0 100644
--- a/llvm/test/Transforms/InstSimplify/maxmin.ll
+++ b/llvm/test/Transforms/InstSimplify/maxmin.ll
@@ -3,9 +3,7 @@
 
 define i8 @smax_min_limit(i8 %x) {
 ; CHECK-LABEL: @smax_min_limit(
-; CHECK-NEXT:[[CMP:%.*]] = icmp sgt i8 [[X:%.*]], -128
-; CHECK-NEXT:[[SEL:%.*]] = select i1 [[CMP]], i8 [[X]], i8 -128
-; CHECK-NEXT:ret i8 [[SEL]]
+; CHECK-NEXT:ret i8 [[X:%.*]]
 ;
   %cmp = icmp sgt i8 %x, -128
   %sel = select i1 %cmp, i8 %x, i8 -128
@@ -14,9 +12,7 @@ define i8 @smax_min_limit(i8 %x) {
 
 define i8 @smin_max_limit(i8 %x) {
 ; CHECK-LABEL: @smin_max_limit(
-; CHECK-NEXT:[[CMP:%.*]] = icmp slt i8 [[X:%.*]], 127
-; CHECK-NEXT:[[SEL:%.*]] = select i1 [[CMP]], i8 [[X]], i8 127
-; CHECK-NEXT:ret i8 [[SEL]]
+; CHECK-NEXT:ret i8 [[X:%.*]]
 ;
   %cmp = icmp slt i8 %x, 127
   %sel = select i1 %cmp, i8 %x, i8 127
@@ -25,9 +21,7 @@ define i8 @smin_max_limit(i8 %x) {
 
 define <2 x i8> @umax_min_limit(<2 x i8> %x) {
 ; CHECK-LABEL: @umax_min_limit(
-; CHECK-NEXT:[[CMP:%.*]] = icmp ugt <2 x i8> [[X:%.*]], zeroinitializer
-; CHECK-NEXT:[[SEL:

[llvm-branch-commits] [llvm] 5b60faa - [InstCombine] avoid infinite loops from min/max canonicalization

2021-08-16 Thread Tom Stellard via llvm-branch-commits

Author: Sanjay Patel
Date: 2021-08-16T11:35:38-07:00
New Revision: 5b60faae3f10e9a7bdf0c1b5b3d123265ce55407

URL: 
https://github.com/llvm/llvm-project/commit/5b60faae3f10e9a7bdf0c1b5b3d123265ce55407
DIFF: 
https://github.com/llvm/llvm-project/commit/5b60faae3f10e9a7bdf0c1b5b3d123265ce55407.diff

LOG: [InstCombine] avoid infinite loops from min/max canonicalization

The intrinsics have an extra chunk of known bits logic
compared to the normal cmp+select idiom. That allows
folding the icmp in each case to something better, but
that then opposes the canonical form of min/max that
we try to form for a select.

I'm carving out a narrow exception to preserve all
existing regression tests while avoiding the inf-loop.
It seems unlikely that this is the only bug like this
left, but this should fix:
https://llvm.org/PR51419
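
The looping behaviour and the effect of the exception can be pictured with a toy rewrite driver. This is only an illustration, not InstCombine: `CmpForm`, `ToyState` and the three functions are hypothetical, and each "transform" just flips a tag instead of rewriting IR.

```cpp
#include <cassert> // used only by the checks in the usage note

// Two shapes the compare can take in this toy model.
enum class CmpForm { CanonicalMinMax, KnownBitsFolded };

struct ToyState {
  CmpForm Form;
  bool FeedsMinMaxSelect; // the compare's only user is a min/max select
};

// Transform 1: replace the canonical compare with a "better" one, unless
// the guard is on and the compare feeds a min/max select.
bool foldUsingKnownBits(ToyState &S, bool GuardEnabled) {
  if (S.Form != CmpForm::CanonicalMinMax)
    return false;
  if (GuardEnabled && S.FeedsMinMaxSelect)
    return false;
  S.Form = CmpForm::KnownBitsFolded;
  return true;
}

// Transform 2: select canonicalization restores the min/max compare.
bool canonicalizeSelect(ToyState &S) {
  if (S.Form != CmpForm::KnownBitsFolded || !S.FeedsMinMaxSelect)
    return false;
  S.Form = CmpForm::CanonicalMinMax;
  return true;
}

// Runs both transforms until nothing changes. Returns false if no fixpoint
// is reached within MaxIters rounds.
bool reachesFixpoint(ToyState S, bool GuardEnabled, int MaxIters = 100) {
  for (int Iter = 0; Iter < MaxIters; ++Iter) {
    bool Changed = foldUsingKnownBits(S, GuardEnabled);
    Changed |= canonicalizeSelect(S);
    if (!Changed)
      return true;
  }
  return false;
}
```

Without the guard, a compare that feeds a min/max select never settles, so `reachesFixpoint({CmpForm::CanonicalMinMax, true}, false)` is false. With the guard it settles at once, and compares that do not feed such a select are still folded.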

(cherry picked from commit b267d3ce8defa092600bda717ff18440d002f316)

Added: 


Modified: 
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
llvm/test/Transforms/InstCombine/select-min-max.ll

Removed: 




diff  --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp 
b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index 2b0ef0c5f2cce..c5e14ebf3ae30 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -5158,6 +5158,83 @@ Instruction 
*InstCombinerImpl::foldICmpUsingKnownBits(ICmpInst &I) {
   if (!isa<Constant>(Op1) && Op1Min == Op1Max)
 return new ICmpInst(Pred, Op0, ConstantExpr::getIntegerValue(Ty, Op1Min));
 
+  // Don't break up a clamp pattern -- (min(max X, Y), Z) -- by replacing a
+  // min/max canonical compare with some other compare. That could lead to
+  // conflict with select canonicalization and infinite looping.
+  // FIXME: This constraint may go away if min/max intrinsics are canonical.
+  auto isMinMaxCmp = [&](Instruction &Cmp) {
+if (!Cmp.hasOneUse())
+  return false;
+Value *A, *B;
+SelectPatternFlavor SPF = matchSelectPattern(Cmp.user_back(), A, B).Flavor;
+if (!SelectPatternResult::isMinOrMax(SPF))
+  return false;
+return match(Op0, m_MaxOrMin(m_Value(), m_Value())) ||
+   match(Op1, m_MaxOrMin(m_Value(), m_Value()));
+  };
+  if (!isMinMaxCmp(I)) {
+switch (Pred) {
+default:
+  break;
+case ICmpInst::ICMP_ULT: {
+  if (Op1Min == Op0Max) // A <u B -> A != B if max(A) == min(B)
+return new ICmpInst(ICmpInst::ICMP_NE, Op0, Op1);
+  const APInt *CmpC;
+  if (match(Op1, m_APInt(CmpC))) {
+// A <u C -> A == C-1 if min(A)+1 == C
+if (*CmpC == Op0Min + 1)
+  return new ICmpInst(ICmpInst::ICMP_EQ, Op0,
+  ConstantInt::get(Op1->getType(), *CmpC - 1));
+// X <u C --> X == 0, if the number of zero bits in the bottom of X
+// exceeds the log2 of C.
+if (Op0Known.countMinTrailingZeros() >= CmpC->ceilLogBase2())
+  return new ICmpInst(ICmpInst::ICMP_EQ, Op0,
+  Constant::getNullValue(Op1->getType()));
+  }
+  break;
+}
+case ICmpInst::ICMP_UGT: {
+  if (Op1Max == Op0Min) // A >u B -> A != B if min(A) == max(B)
+return new ICmpInst(ICmpInst::ICMP_NE, Op0, Op1);
+  const APInt *CmpC;
+  if (match(Op1, m_APInt(CmpC))) {
+// A >u C -> A == C+1 if max(a)-1 == C
+if (*CmpC == Op0Max - 1)
+  return new ICmpInst(ICmpInst::ICMP_EQ, Op0,
+  ConstantInt::get(Op1->getType(), *CmpC + 1));
+// X >u C --> X != 0, if the number of zero bits in the bottom of X
+// exceeds the log2 of C.
+if (Op0Known.countMinTrailingZeros() >= CmpC->getActiveBits())
+  return new ICmpInst(ICmpInst::ICMP_NE, Op0,
+  Constant::getNullValue(Op1->getType()));
+  }
+  break;
+}
+case ICmpInst::ICMP_SLT: {
+  if (Op1Min == Op0Max) // A <s B -> A != B if max(A) == min(B)
+return new ICmpInst(ICmpInst::ICMP_NE, Op0, Op1);
+  const APInt *CmpC;
+  if (match(Op1, m_APInt(CmpC))) {
+if (*CmpC == Op0Min + 1) // A <s C -> A == C-1 if min(A)+1 == C
+  return new ICmpInst(ICmpInst::ICMP_EQ, Op0,
+  ConstantInt::get(Op1->getType(), *CmpC - 1));
+  }
+  break;
+}
+case ICmpInst::ICMP_SGT: {
+  if (Op1Max == Op0Min) // A >s B -> A != B if min(A) == max(B)
+return new ICmpInst(ICmpInst::ICMP_NE, Op0, Op1);
+  const APInt *CmpC;
+  if (match(Op1, m_APInt(CmpC))) {
+if (*CmpC == Op0Max - 1) // A >s C -> A == C+1 if max(A)-1 == C
+  return new ICmpInst(ICmpInst::ICMP_EQ, Op0,
+  ConstantInt::get(Op1->getType(), *CmpC + 1));
+  }
+  break;
+}
+}
+  }
+
   // Based on the range information we know about the LHS, see if we can
   // simplify this comparison.  For example, (x&4) < 8 is 

[llvm-branch-commits] [llvm] 24d8b65 - [Attributor][FIX] Guard constant casts with type size checks

2021-08-16 Thread Tom Stellard via llvm-branch-commits

Author: Johannes Doerfert
Date: 2021-08-16T11:36:30-07:00
New Revision: 24d8b6565a2e298f65df7b63cdeb874ce9f4ed91

URL: 
https://github.com/llvm/llvm-project/commit/24d8b6565a2e298f65df7b63cdeb874ce9f4ed91
DIFF: 
https://github.com/llvm/llvm-project/commit/24d8b6565a2e298f65df7b63cdeb874ce9f4ed91.diff

LOG: [Attributor][FIX] Guard constant casts with type size checks
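
The size guard can be illustrated with a small standalone sketch. It is not
the Attributor code: IntConstant and truncConstant are hypothetical names
used only here, and integer constants are modelled as a bit width plus a
value. The point is that a truncating cast is only attempted when the source
is at least as wide as the destination.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical model of an integer constant: a bit width and a value.
struct IntConstant {
  unsigned Bits;
  uint64_t Value;
};

// Truncate C to DstBits, but only if that is really a truncation.
// A narrower source (e.g. an i32 value read back as i64) yields no result
// instead of an invalid "truncation" to a wider type.
std::optional<IntConstant> truncConstant(const IntConstant &C,
                                         unsigned DstBits) {
  if (DstBits == 0 || DstBits > 64)
    return std::nullopt; // outside what this sketch models
  if (C.Bits < DstBits)
    return std::nullopt; // the size guard: destination is wider
  uint64_t Mask = DstBits == 64 ? ~0ULL : ((1ULL << DstBits) - 1);
  return IntConstant{DstBits, C.Value & Mask};
}
```

In this model, an i32 constant narrowed to i8 produces a value, while the
same constant requested as i64 produces nothing, which loosely mirrors the
cast_and_load_1 and cast_and_load_2 tests in the diff.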

(cherry picked from commit 5f543919b2646d36f2ddc1424acdd555bfcebe4f)

Added: 


Modified: 
llvm/lib/Transforms/IPO/Attributor.cpp
llvm/test/Transforms/Attributor/value-simplify-pointer-info.ll

Removed: 




diff  --git a/llvm/lib/Transforms/IPO/Attributor.cpp b/llvm/lib/Transforms/IPO/Attributor.cpp
index 5fecbf35fef02..91b16ec66ee39 100644
--- a/llvm/lib/Transforms/IPO/Attributor.cpp
+++ b/llvm/lib/Transforms/IPO/Attributor.cpp
@@ -251,10 +251,12 @@ Value *AA::getWithType(Value &V, Type &Ty) {
   return Constant::getNullValue(&Ty);
 if (C->getType()->isPointerTy() && Ty.isPointerTy())
   return ConstantExpr::getPointerCast(C, &Ty);
-if (C->getType()->isIntegerTy() && Ty.isIntegerTy())
-  return ConstantExpr::getTrunc(C, &Ty, /* OnlyIfReduced */ true);
-if (C->getType()->isFloatingPointTy() && Ty.isFloatingPointTy())
-  return ConstantExpr::getFPTrunc(C, &Ty, /* OnlyIfReduced */ true);
+if (C->getType()->getPrimitiveSizeInBits() >= Ty.getPrimitiveSizeInBits()) {
+  if (C->getType()->isIntegerTy() && Ty.isIntegerTy())
+return ConstantExpr::getTrunc(C, &Ty, /* OnlyIfReduced */ true);
+  if (C->getType()->isFloatingPointTy() && Ty.isFloatingPointTy())
+return ConstantExpr::getFPTrunc(C, &Ty, /* OnlyIfReduced */ true);
+}
   }
   return nullptr;
 }

diff  --git a/llvm/test/Transforms/Attributor/value-simplify-pointer-info.ll b/llvm/test/Transforms/Attributor/value-simplify-pointer-info.ll
index fdb974f003a98..80b636951aa36 100644
--- a/llvm/test/Transforms/Attributor/value-simplify-pointer-info.ll
+++ b/llvm/test/Transforms/Attributor/value-simplify-pointer-info.ll
@@ -26,6 +26,8 @@
 @a1 = internal global i32 zeroinitializer
 @a2 = internal global i32 zeroinitializer
 @a3 = internal global i32 undef
+@bytes1 = internal global i32 undef
+@bytes2 = internal global i32 undef
 
 ;.
 ; CHECK: @[[GLOBALBYTES:[a-zA-Z0-9_$"\\.-]+]] = global [1024 x i8] zeroinitializer, align 16
@@ -48,6 +50,8 @@
 ; CHECK: @[[A1:[a-zA-Z0-9_$"\\.-]+]] = internal global i32 0
 ; CHECK: @[[A2:[a-zA-Z0-9_$"\\.-]+]] = internal global i32 0
 ; CHECK: @[[A3:[a-zA-Z0-9_$"\\.-]+]] = internal global i32 undef
+; CHECK: @[[BYTES1:[a-zA-Z0-9_$"\\.-]+]] = internal global i32 undef
+; CHECK: @[[BYTES2:[a-zA-Z0-9_$"\\.-]+]] = internal global i32 undef
 ;.
 define void @write_arg(i32* %p, i32 %v) {
 ; IS__TUNIT: Function Attrs: argmemonly nofree nosync nounwind willreturn writeonly
@@ -3230,6 +3234,62 @@ end:
   ret i8 %add
 }
 
+define i8 @cast_and_load_1() {
+; IS__TUNIT_OPM: Function Attrs: nofree nosync nounwind willreturn
+; IS__TUNIT_OPM-LABEL: define {{[^@]+}}@cast_and_load_1
+; IS__TUNIT_OPM-SAME: () #[[ATTR4]] {
+; IS__TUNIT_OPM-NEXT:store i32 42, i32* @bytes1, align 4
+; IS__TUNIT_OPM-NEXT:[[L:%.*]] = load i8, i8* bitcast (i32* @bytes1 to i8*), align 4
+; IS__TUNIT_OPM-NEXT:ret i8 [[L]]
+;
+; IS__TUNIT_NPM: Function Attrs: nofree nosync nounwind willreturn
+; IS__TUNIT_NPM-LABEL: define {{[^@]+}}@cast_and_load_1
+; IS__TUNIT_NPM-SAME: () #[[ATTR2]] {
+; IS__TUNIT_NPM-NEXT:store i32 42, i32* @bytes1, align 4
+; IS__TUNIT_NPM-NEXT:[[L:%.*]] = load i8, i8* bitcast (i32* @bytes1 to i8*), align 4
+; IS__TUNIT_NPM-NEXT:ret i8 [[L]]
+;
+; IS__CGSCC: Function Attrs: nofree norecurse nosync nounwind willreturn
+; IS__CGSCC-LABEL: define {{[^@]+}}@cast_and_load_1
+; IS__CGSCC-SAME: () #[[ATTR5]] {
+; IS__CGSCC-NEXT:store i32 42, i32* @bytes1, align 4
+; IS__CGSCC-NEXT:[[L:%.*]] = load i8, i8* bitcast (i32* @bytes1 to i8*), align 4
+; IS__CGSCC-NEXT:ret i8 [[L]]
+;
+  store i32 42, i32* @bytes1
+  %bc = bitcast i32* @bytes1 to i8*
+  %l = load i8, i8* %bc
+  ret i8 %l
+}
+
+define i64 @cast_and_load_2() {
+; IS__TUNIT_OPM: Function Attrs: nofree nosync nounwind willreturn
+; IS__TUNIT_OPM-LABEL: define {{[^@]+}}@cast_and_load_2
+; IS__TUNIT_OPM-SAME: () #[[ATTR4]] {
+; IS__TUNIT_OPM-NEXT:store i32 42, i32* @bytes2, align 4
+; IS__TUNIT_OPM-NEXT:[[L:%.*]] = load i64, i64* bitcast (i32* @bytes2 to i64*), align 4
+; IS__TUNIT_OPM-NEXT:ret i64 [[L]]
+;
+; IS__TUNIT_NPM: Function Attrs: nofree nosync nounwind willreturn
+; IS__TUNIT_NPM-LABEL: define {{[^@]+}}@cast_and_load_2
+; IS__TUNIT_NPM-SAME: () #[[ATTR2]] {
+; IS__TUNIT_NPM-NEXT:store i32 42, i32* @bytes2, align 4
+; IS__TUNIT_NPM-NEXT:[[L:%.*]] = load i64, i64* bitcast (i32* @bytes2 to i64*), align 4
+; IS__TUNIT_NPM-NEXT:ret i64 [[L]]
+;
+; IS__CGSCC: Function Attrs: nofree norecu

[llvm-branch-commits] [clang] d86e569 - [clang] [hexagon] Add resource include dir

2021-08-16 Thread Tom Stellard via llvm-branch-commits

Author: Brian Cain
Date: 2021-08-16T11:37:24-07:00
New Revision: d86e569e81191606fd813061e3133eebe2720b5c

URL: 
https://github.com/llvm/llvm-project/commit/d86e569e81191606fd813061e3133eebe2720b5c
DIFF: 
https://github.com/llvm/llvm-project/commit/d86e569e81191606fd813061e3133eebe2720b5c.diff

LOG: [clang] [hexagon] Add resource include dir

(cherry picked from commit 76ba272baf68ff38fcfc36c15ac2510bdea7)

Added: 


Modified: 
clang/lib/Driver/ToolChains/Hexagon.cpp
clang/test/Driver/hexagon-toolchain-linux.c

Removed: 




diff  --git a/clang/lib/Driver/ToolChains/Hexagon.cpp b/clang/lib/Driver/ToolChains/Hexagon.cpp
index 828bfdbb05a3c..314d0efce4414 100644
--- a/clang/lib/Driver/ToolChains/Hexagon.cpp
+++ b/clang/lib/Driver/ToolChains/Hexagon.cpp
@@ -588,21 +588,43 @@ void HexagonToolChain::addClangTargetOptions(const ArgList &DriverArgs,
 
 void HexagonToolChain::AddClangSystemIncludeArgs(const ArgList &DriverArgs,
  ArgStringList &CC1Args) const {
-  if (DriverArgs.hasArg(options::OPT_nostdinc) ||
-  DriverArgs.hasArg(options::OPT_nostdlibinc))
+  if (DriverArgs.hasArg(options::OPT_nostdinc))
 return;
 
+  const bool IsELF = !getTriple().isMusl() && !getTriple().isOSLinux();
+  const bool IsLinuxMusl = getTriple().isMusl() && getTriple().isOSLinux();
+
   const Driver &D = getDriver();
-  if (!D.SysRoot.empty()) {
+  SmallString<128> ResourceDirInclude(D.ResourceDir);
+  if (!IsELF) {
+llvm::sys::path::append(ResourceDirInclude, "include");
+if (!DriverArgs.hasArg(options::OPT_nobuiltininc) &&
+(!IsLinuxMusl || DriverArgs.hasArg(options::OPT_nostdlibinc)))
+  addSystemInclude(DriverArgs, CC1Args, ResourceDirInclude);
+  }
+  if (DriverArgs.hasArg(options::OPT_nostdlibinc))
+return;
+
+  const bool HasSysRoot = !D.SysRoot.empty();
+  if (HasSysRoot) {
 SmallString<128> P(D.SysRoot);
-if (getTriple().isMusl())
+if (IsLinuxMusl)
   llvm::sys::path::append(P, "usr/include");
 else
   llvm::sys::path::append(P, "include");
+
 addExternCSystemInclude(DriverArgs, CC1Args, P.str());
-return;
+// LOCAL_INCLUDE_DIR
+addSystemInclude(DriverArgs, CC1Args, P + "/usr/local/include");
+// TOOL_INCLUDE_DIR
+AddMultilibIncludeArgs(DriverArgs, CC1Args);
   }
 
+  if (!DriverArgs.hasArg(options::OPT_nobuiltininc) && IsLinuxMusl)
+addSystemInclude(DriverArgs, CC1Args, ResourceDirInclude);
+
+  if (HasSysRoot)
+return;
   std::string TargetDir = getHexagonTargetDir(D.getInstalledDir(),
   D.PrefixDirs);
   addExternCSystemInclude(DriverArgs, CC1Args, TargetDir + "/hexagon/include");

diff  --git a/clang/test/Driver/hexagon-toolchain-linux.c b/clang/test/Driver/hexagon-toolchain-linux.c
index 354a924f12098..da59590371b90 100644
--- a/clang/test/Driver/hexagon-toolchain-linux.c
+++ b/clang/test/Driver/hexagon-toolchain-linux.c
@@ -1,3 +1,5 @@
+// UNSUPPORTED: system-windows
+
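 // -----------------------------------------------------------------------------
 // Passing --musl
 // -----------------------------------------------------------------------------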
@@ -94,4 +96,26 @@
 // RUN:   -mcpu=hexagonv60 \
 // RUN:   %s 2>&1 \
 // RUN:   | FileCheck -check-prefix=CHECK007 %s
-// CHECK007:  "-internal-isystem" "{{.*}}hexagon{{/|}}include{{/|}}c++{{/|}}v1"
+// CHECK007:   "-internal-isystem" "{{.*}}hexagon{{/|}}include{{/|}}c++{{/|}}v1"
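+// -----------------------------------------------------------------------------
+// internal-isystem for linux with and without musl
+// -----------------------------------------------------------------------------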
+// RUN: %clang -### -target hexagon-unknown-linux-musl \
+// RUN:   -ccc-install-dir %S/Inputs/hexagon_tree/Tools/bin \
+// RUN:   -resource-dir=%S/Inputs/resource_dir \
+// RUN:   %s 2>&1 \
+// RUN:   | FileCheck -check-prefix=CHECK008 %s
+// CHECK008:   InstalledDir: [[INSTALLED_DIR:.+]]
+// CHECK008:   "-resource-dir" "[[RESOURCE:[^"]+]]"
+// CHECK008-SAME: {{^}} "-internal-isystem" "[[RESOURCE]]/include"
+// CHECK008-SAME: {{^}} "-internal-externc-isystem" "[[INSTALLED_DIR]]/../target/hexagon/include"
+
+// RUN: %clang -### -target hexagon-unknown-linux \
+// RUN:   -ccc-install-dir %S/Inputs/hexagon_tree/Tools/bin \
+// RUN:   -resource-dir=%S/Inputs/resource_dir \
+// RUN:   %s 2>&1 \
+// RUN:   | FileCheck -check-prefix=CHECK009 %s
+// CHECK009:   InstalledDir: [[INSTALLED_DIR:.+]]
+// CHECK009:   "-resource-dir" "[[RESOURCE:[^"]+]]"
+// CHECK009-SAME: {{^}} "-internal-isystem" "[[RESOURCE]]/include"
+// CHECK009-SAME: {{^}} "-internal-externc-isystem" "[[INSTALLED_DIR]]/../target/hexagon/include"



___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mai

[llvm-branch-commits] [clang] b9be17a - [clang] fix crash on template instantiation of invalid requires expressions

2021-08-16 Thread Tom Stellard via llvm-branch-commits

Author: Matheus Izvekov
Date: 2021-08-16T20:36:52-07:00
New Revision: b9be17a7ecf960864490349401ca0d2dbba27c8b

URL: 
https://github.com/llvm/llvm-project/commit/b9be17a7ecf960864490349401ca0d2dbba27c8b
DIFF: 
https://github.com/llvm/llvm-project/commit/b9be17a7ecf960864490349401ca0d2dbba27c8b.diff

LOG: [clang] fix crash on template instantiation of invalid requires expressions

See PR48656.

The implementation of the template instantiation of requires expressions
was incorrectly trying to get the expression from an 'ExprRequirement'
before checking if it was an error state.
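
As a rough illustration of the ordering problem, the sketch below models a
requirement that holds either an expression or a substitution diagnostic.
The types are simplified stand-ins, not clang's real classes, and
beginLocIfValid is a hypothetical helper. The accessor for the expression is
only reached after the error state has been ruled out.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <variant>

// Simplified stand-ins for illustration only; not clang's real types.
struct Expr {
  int BeginLoc;
};
struct SubstitutionDiagnostic {
  std::string Message;
};

class ExprRequirement {
  std::variant<Expr, SubstitutionDiagnostic> Value;

public:
  explicit ExprRequirement(Expr E) : Value(E) {}
  explicit ExprRequirement(SubstitutionDiagnostic D) : Value(std::move(D)) {}

  bool isExprSubstitutionFailure() const {
    return std::holds_alternative<SubstitutionDiagnostic>(Value);
  }

  // Only valid when the requirement is not in the error state.
  const Expr &getExpr() const {
    assert(!isExprSubstitutionFailure() && "no expression in error state");
    return std::get<Expr>(Value);
  }
};

// Hypothetical helper: check the error state first, then touch the
// expression. Calling getExpr() before the check is the kind of ordering
// the commit message describes as incorrect.
std::optional<int> beginLocIfValid(const ExprRequirement &Req) {
  if (Req.isExprSubstitutionFailure())
    return std::nullopt;
  return Req.getExpr().BeginLoc;
}
```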

Signed-off-by: Matheus Izvekov 

Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D107399

(cherry picked from commit e64e6924b8aef8d48117beb6e87162109ac2512c)

Added: 


Modified: 
clang/lib/Sema/SemaTemplateInstantiate.cpp
clang/test/CXX/expr/expr.prim/expr.prim.req/type-requirement.cpp
clang/test/CXX/temp/temp.constr/temp.constr.normal/p1.cpp

Removed: 




diff  --git a/clang/lib/Sema/SemaTemplateInstantiate.cpp b/clang/lib/Sema/SemaTemplateInstantiate.cpp
index f18f77d3442a1..74889aa3ca888 100644
--- a/clang/lib/Sema/SemaTemplateInstantiate.cpp
+++ b/clang/lib/Sema/SemaTemplateInstantiate.cpp
@@ -1934,25 +1934,23 @@ TemplateInstantiator::TransformExprRequirement(concepts::ExprRequirement *Req) {
 return Req;
 
   Sema::SFINAETrap Trap(SemaRef);
-  TemplateDeductionInfo Info(Req->getExpr()->getBeginLoc());
 
   llvm::PointerUnion
   TransExpr;
   if (Req->isExprSubstitutionFailure())
 TransExpr = Req->getExprSubstitutionDiagnostic();
   else {
-Sema::InstantiatingTemplate ExprInst(SemaRef, Req->getExpr()->getBeginLoc(),
- Req, Info,
- Req->getExpr()->getSourceRange());
+Expr *E = Req->getExpr();
+TemplateDeductionInfo Info(E->getBeginLoc());
+Sema::InstantiatingTemplate ExprInst(SemaRef, E->getBeginLoc(), Req, Info,
+ E->getSourceRange());
 if (ExprInst.isInvalid())
   return nullptr;
-ExprResult TransExprRes = TransformExpr(Req->getExpr());
+ExprResult TransExprRes = TransformExpr(E);
 if (TransExprRes.isInvalid() || Trap.hasErrorOccurred())
-  TransExpr = createSubstDiag(SemaRef, Info,
-  [&] (llvm::raw_ostream& OS) {
-  Req->getExpr()->printPretty(OS, nullptr,
-  SemaRef.getPrintingPolicy());
-  });
+  TransExpr = createSubstDiag(SemaRef, Info, [&](llvm::raw_ostream &OS) {
+E->printPretty(OS, nullptr, SemaRef.getPrintingPolicy());
+  });
 else
   TransExpr = TransExprRes.get();
   }
@@ -1966,6 +1964,7 @@ TemplateInstantiator::TransformExprRequirement(concepts::ExprRequirement *Req) {
   else if (RetReq.isTypeConstraint()) {
 TemplateParameterList *OrigTPL =
 RetReq.getTypeConstraintTemplateParameterList();
+TemplateDeductionInfo Info(OrigTPL->getTemplateLoc());
 Sema::InstantiatingTemplate TPLInst(SemaRef, OrigTPL->getTemplateLoc(),
 Req, Info, OrigTPL->getSourceRange());
 if (TPLInst.isInvalid())

diff  --git a/clang/test/CXX/expr/expr.prim/expr.prim.req/type-requirement.cpp b/clang/test/CXX/expr/expr.prim/expr.prim.req/type-requirement.cpp
index 15cbe66378450..5433cfb21955d 100644
--- a/clang/test/CXX/expr/expr.prim/expr.prim.req/type-requirement.cpp
+++ b/clang/test/CXX/expr/expr.prim/expr.prim.req/type-requirement.cpp
@@ -192,3 +192,29 @@ namespace std_example {
   using c3 = C2_check; // expected-error{{constraints not satisfied 
for class template 'C2_check' [with T = std_example::has_inner]}}
   using c4 = C3_check; // expected-error{{constraints not satisfied for 
class template 'C3_check' [with T = void]}}
 }
+
+namespace PR48656 {
+
+template  concept C = requires { requires requires { T::a; }; };
+// expected-note@-1 {{because 'T::a' would be invalid: no member named 'a' in 
'PR48656::T1'}}
+
+template  struct A {};
+// expected-note@-1 {{because 'PR48656::T1' does not satisfy 'C'}}
+
+struct T1 {};
+template struct A; // expected-error {{constraints not satisfied for class 
template 'A' [with $0 = ]}}
+
+struct T2 { static constexpr bool a = false; };
+template struct A;
+
+template  struct T3 {
+  static void m(auto) requires requires { T::fail; } {}
+  // expected-note@-1 {{constraints not satisfied}}
+  // expected-note@-2 {{type 'int' cannot be used prior to '::'}}
+};
+template  void t3(Args... args) { (..., T3::m(args)); }
+// expected-error@-1 {{no matching function for call to 'm'}}
+
+template void t3(int); // expected-note {{requested here}}
+
+} // namespace PR48656

diff  --git a/clang/test/CXX/temp/temp.constr/temp.constr.normal/p1.cpp b/clang/test/CXX/temp/temp.constr/temp.constr.normal/p1.cpp
index 593902c6b74d9..d80710937cdfa 100644
--- a/clang/test/CXX

[llvm-branch-commits] [lld] 87d56ad - [LLD] [MinGW] Add more options for disabling flags in the executable

2021-08-16 Thread Tom Stellard via llvm-branch-commits

Author: Martin Storsjö
Date: 2021-08-16T20:39:35-07:00
New Revision: 87d56ad4411d96ee1604939409cf6e215fb1ff2e

URL: 
https://github.com/llvm/llvm-project/commit/87d56ad4411d96ee1604939409cf6e215fb1ff2e
DIFF: 
https://github.com/llvm/llvm-project/commit/87d56ad4411d96ee1604939409cf6e215fb1ff2e.diff

LOG: [LLD] [MinGW] Add more options for disabling flags in the executable

In e72403f96de7f1c681acd5968f72aa986412dfce, we added the flag
"--no-dynamicbase" for disabling the dynamicbase flag which we set
by default. At the time, ld.bfd didn't have any corresponding
option (as ld.bfd defaulted to not setting the flag). Almost at
the same time, corresponding options were added to ld.bfd for
disabling it (while it was being enabled by default), with a
different name, "--disable-dynamicbase".

Thus add the "--disable-dynamicbase" option. Make this the default
form advertised in the help listing, but keep the "--no-dynamicbase"
form as an alias. Also improve checking for the last option set
if there are multiple ones on the same command line.

Also add corresponding disable options for a lot of other flags
that we set by default, also added in ld.bfd in the same commit:
https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=514b4e191d5f46de8e142fe216e677a35fa9c4bb
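
The "last option set" behaviour can be sketched in isolation as below. This
is not the lld driver code, which goes through LLVM's option parsing;
disablesDynamicBase is a hypothetical helper over plain strings, and the
sketch ignores the target check the driver performs. It treats
"--no-dynamicbase" as an alias of "--disable-dynamicbase" and lets the last
occurrence on the command line decide.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns true if dynamicbase should be disabled.
// The default (no relevant option given) is to leave it enabled.
bool disablesDynamicBase(const std::vector<std::string> &Args) {
  bool Disable = false;
  for (const std::string &A : Args) {
    if (A == "--disable-dynamicbase" || A == "--no-dynamicbase")
      Disable = true; // both spellings mean the same thing
    else if (A == "--dynamicbase")
      Disable = false; // a later enabling option overrides earlier ones
  }
  return Disable;
}
```

With this model, "--dynamicbase --disable-dynamicbase" disables the flag,
while "--no-dynamicbase --dynamicbase" leaves it enabled.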

Differential Revision: https://reviews.llvm.org/D107930

(cherry picked from commit f8340c8c5de6273c2f000a8cef7e1de056b34332)

Added: 


Modified: 
lld/MinGW/Driver.cpp
lld/MinGW/Options.td
lld/test/MinGW/driver.test

Removed: 




diff  --git a/lld/MinGW/Driver.cpp b/lld/MinGW/Driver.cpp
index 4a3a9ef9be030..7c6b865a2e398 100644
--- a/lld/MinGW/Driver.cpp
+++ b/lld/MinGW/Driver.cpp
@@ -308,13 +308,19 @@ bool mingw::link(ArrayRef argsArr, bool canExitEarly,
 add("-kill-at");
   if (args.hasArg(OPT_appcontainer))
 add("-appcontainer");
-  if (args.hasArg(OPT_no_seh))
+  if (args.hasFlag(OPT_no_seh, OPT_disable_no_seh, false))
 add("-noseh");
 
   if (args.getLastArgValue(OPT_m) != "thumb2pe" &&
   args.getLastArgValue(OPT_m) != "arm64pe" &&
-  args.hasArg(OPT_no_dynamicbase))
+  args.hasFlag(OPT_disable_dynamicbase, OPT_dynamicbase, false))
 add("-dynamicbase:no");
+  if (args.hasFlag(OPT_disable_high_entropy_va, OPT_high_entropy_va, false))
+add("-highentropyva:no");
+  if (args.hasFlag(OPT_disable_nxcompat, OPT_nxcompat, false))
+add("-nxcompat:no");
+  if (args.hasFlag(OPT_disable_tsaware, OPT_tsaware, false))
+add("-tsaware:no");
 
   if (args.hasFlag(OPT_no_insert_timestamp, OPT_insert_timestamp, false))
 add("-timestamp:0");

diff  --git a/lld/MinGW/Options.td b/lld/MinGW/Options.td
index fb178a1438e55..50ac71bced850 100644
--- a/lld/MinGW/Options.td
+++ b/lld/MinGW/Options.td
@@ -26,6 +26,11 @@ multiclass B {
   def no_ # NAME: Flag<["--", "-"], "no-" # name>, HelpText;
 }
 
+multiclass B_disable {
+  def NAME: Flag<["--", "-"], name>, HelpText;
+  def disable_ # NAME: Flag<["--", "-"], "disable-" # name>, HelpText;
+}
+
 def L: JoinedOrSeparate<["-"], "L">, MetaVarName<"">,
   HelpText<"Add a directory to the library search path">;
 defm allow_multiple_definition: B<"allow-multiple-definition",
@@ -42,7 +47,7 @@ def disable_runtime_pseudo_reloc: F<"disable-runtime-pseudo-reloc">,
 HelpText<"Don't do automatic imports that require runtime fixups">;
 def disable_stdcall_fixup: F<"disable-stdcall-fixup">,
 HelpText<"Don't resolve stdcall/fastcall/vectorcall to undecorated symbols">;
-defm dynamicbase: B<"dynamicbase", "Enable ASLR", "Disable ASLR">;
+defm dynamicbase: B_disable<"dynamicbase", "Enable ASLR", "Disable ASLR">;
 def enable_auto_import: F<"enable-auto-import">,
 HelpText<"Automatically import data symbols from other DLLs where needed">;
 def enable_runtime_pseudo_reloc: F<"enable-runtime-pseudo-reloc">,
@@ -62,6 +67,8 @@ defm gc_sections: B<"gc-sections",
 "Remove unused sections",
 "Don't remove unused sections">;
 def help: F<"help">, HelpText<"Print option help">;
+defm high_entropy_va: B_disable<"high-entropy-va",
+"Set the 'high entropy VA' flag", "Don't set the 'high entropy VA' flag">;
 defm icf: Eq<"icf", "Identical code folding">;
 defm image_base: Eq<"image-base", "Base address of the program">;
 defm insert_timestamp: B<"insert-timestamp",
@@ -80,7 +87,10 @@ defm minor_os_version: EqLong<"minor-os-version",
  "Set the OS and subsystem minor version">;
 defm minor_subsystem_version: EqLong<"minor-subsystem-version",
  "Set the OS and subsystem minor version">;
-def no_seh: F<"no-seh">, HelpText<"Set the 'no SEH' flag in the executable">;
+defm no_seh: B_disable<"no-seh",
+ "Set the 'no SEH' flag in the executable", "Don't set the 'no SEH' flag">;
+defm nxcompat: B_disable<"nxcompat",
+"Set the 'nxcompat' flag in the executable", "Don't set the 'nxcompat' flag">;
 def large_address_aware: Flag<["--"], "large-address-aware">,
 HelpText<"Ena

[llvm-branch-commits] [llvm] 2153cad - [DAGCombiner] Stop visitEXTRACT_SUBVECTOR creating illegal BITCASTs post legalisation.

2021-08-16 Thread Tom Stellard via llvm-branch-commits

Author: Paul Walker
Date: 2021-08-16T23:26:32-07:00
New Revision: 2153cad11ba252698c21d48723265ca7f4850a29

URL: 
https://github.com/llvm/llvm-project/commit/2153cad11ba252698c21d48723265ca7f4850a29
DIFF: 
https://github.com/llvm/llvm-project/commit/2153cad11ba252698c21d48723265ca7f4850a29.diff

LOG: [DAGCombiner] Stop visitEXTRACT_SUBVECTOR creating illegal BITCASTs post legalisation.

visitEXTRACT_SUBVECTOR can sometimes create illegal BITCASTs when
removing "redundant" INSERT_SUBVECTOR operations.  This patch adds
an extra check to ensure such combines only occur after operation
legalisation if any resulting BITCAST is itself legal.
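
The shape of the added check can be shown with a tiny standalone model. The
names below (Action, combineExtractOfInsert) are hypothetical and stand in
for the DAGCombiner state; the sketch only captures the decision, not the
DAG rewriting itself.

```cpp
#include <cassert>

// Hypothetical outcome of the combine in this simplified model.
enum class Action { EmitBitcast, Bail };

// Before operation legalisation any bitcast may be created, because the
// legaliser still gets a chance to fix it up. Afterwards the combine must
// only create a bitcast that the target reports as legal.
Action combineExtractOfInsert(bool LegalOperations, bool BitcastIsLegal) {
  if (LegalOperations && !BitcastIsLegal)
    return Action::Bail; // leave the node alone
  return Action::EmitBitcast;
}
```

Bailing out in the illegal case is what avoids the
combine->legalise->combine cycle mentioned in the test comment of this
patch.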

Differential Revision: https://reviews.llvm.org/D108086

(cherry picked from commit cd0e1964137f1cd7b508809ec80c7d9dcb3f0458)

Added: 


Modified: 
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
llvm/test/CodeGen/AArch64/sve-fixed-length-masked-scatter.ll

Removed: 




diff  --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 1bba7232eb14d..4f730b2cf372d 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -20560,8 +20560,12 @@ SDValue DAGCombiner::visitEXTRACT_SUBVECTOR(SDNode *N) {
 //otherwise => (extract_subvec V1, ExtIdx)
 uint64_t InsIdx = V.getConstantOperandVal(2);
 if (InsIdx * SmallVT.getScalarSizeInBits() ==
-ExtIdx * NVT.getScalarSizeInBits())
+ExtIdx * NVT.getScalarSizeInBits()) {
+  if (LegalOperations && !TLI.isOperationLegal(ISD::BITCAST, NVT))
+return SDValue();
+
   return DAG.getBitcast(NVT, V.getOperand(1));
+}
 return DAG.getNode(
 ISD::EXTRACT_SUBVECTOR, SDLoc(N), NVT,
 DAG.getBitcast(N->getOperand(0).getValueType(), V.getOperand(0)),

diff  --git a/llvm/test/CodeGen/AArch64/sve-fixed-length-masked-scatter.ll b/llvm/test/CodeGen/AArch64/sve-fixed-length-masked-scatter.ll
index ec0f4bdf025c6..2ce98f00687df 100644
--- a/llvm/test/CodeGen/AArch64/sve-fixed-length-masked-scatter.ll
+++ b/llvm/test/CodeGen/AArch64/sve-fixed-length-masked-scatter.ll
@@ -1057,6 +1057,37 @@ define void @masked_scatter_vec_plus_imm(<32 x float>* %a, <32 x i8*>* %b) #0 {
   ret void
 }
 
+; extract_subvec(...(insert_subvec(a,b,c))) -> extract_subvec(bitcast(b),d) like
+; combines can effectively unlegalise bitcast operations. This test ensures such
+; combines do not happen after operation legalisation. When not prevented the
+; test triggers infinite combine->legalise->combine->...
+;
+; NOTE: For this test to function correctly it's critical for %vals to be in a
+; different block to the scatter store.  If not, the problematic bitcast will be
+; removed before operation legalisation and thus not exercise the combine.
+define void @masked_scatter_bitcast_infinite_loop(<8 x double>* %a, <8 x double*>* %b, i1 %cond) #0 {
+; CHECK-LABEL: masked_scatter_bitcast_infinite_loop
+; VBITS_GE_512: ptrue [[PG0:p[0-9]+]].d, vl8
+; VBITS_GE_512-NEXT: ld1d { [[VALS:z[0-9]+]].d }, [[PG0]]/z, [x0]
+; VBITS_GE_512-NEXT: tbz w2, #0, [[LABEL:.*]]
+; VBITS_GE_512-NEXT: ld1d { [[PTRS:z[0-9]+]].d }, [[PG0]]/z, [x1]
+; VBITS_GE_512-NEXT: fcmeq [[MASK:p[0-9]+]].d, [[PG0]]/z, [[VALS]].d, #0.0
+; VBITS_GE_512-NEXT: st1d { [[VALS]].d }, [[MASK]], {{\[}}[[PTRS]].d]
+; VBITS_GE_512-NEXT: [[LABEL]]:
+; VBITS_GE_512-NEXT: ret
+  %vals = load volatile <8 x double>, <8 x double>* %a
+  br i1 %cond, label %bb.1, label %bb.2
+
+bb.1:
+  %ptrs = load <8 x double*>, <8 x double*>* %b
+  %mask = fcmp oeq <8 x double> %vals, zeroinitializer
+  call void @llvm.masked.scatter.v8f64(<8 x double> %vals, <8 x double*> %ptrs, i32 8, <8 x i1> %mask)
+  br label %bb.2
+
+bb.2:
+  ret void
+}
+
 declare void @llvm.masked.scatter.v2i8(<2 x i8>, <2 x i8*>, i32, <2 x i1>)
 declare void @llvm.masked.scatter.v4i8(<4 x i8>, <4 x i8*>, i32, <4 x i1>)
 declare void @llvm.masked.scatter.v8i8(<8 x i8>, <8 x i8*>, i32, <8 x i1>)


