[llvm-branch-commits] [llvm] [LAA] Use PSE::getSymbolicMaxBackedgeTakenCount. (PR #93499)

2024-05-28 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn closed https://github.com/llvm/llvm-project/pull/93499
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LAA] Use PSE::getSymbolicMaxBackedgeTakenCount. (PR #93499)

2024-05-28 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn reopened 
https://github.com/llvm/llvm-project/pull/93499


[llvm-branch-commits] [llvm] [LAA] Use SCEVUse to add extra NUW flags to pointer bounds. (WIP) (PR #91962)

2024-06-25 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/91962

>From 9a8305b0041586627b3c3c8a1dc954306767cadc Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Wed, 1 May 2024 11:03:42 +0100
Subject: [PATCH 1/3] [SCEV,LAA] Add tests to make sure scoped SCEVs don't
 impact other SCEVs.

---
 .../LoopAccessAnalysis/scoped-scevs.ll| 182 ++
 1 file changed, 182 insertions(+)
 create mode 100644 llvm/test/Analysis/LoopAccessAnalysis/scoped-scevs.ll

diff --git a/llvm/test/Analysis/LoopAccessAnalysis/scoped-scevs.ll 
b/llvm/test/Analysis/LoopAccessAnalysis/scoped-scevs.ll
new file mode 100644
index 0..323ba2a739cf8
--- /dev/null
+++ b/llvm/test/Analysis/LoopAccessAnalysis/scoped-scevs.ll
@@ -0,0 +1,182 @@
+; NOTE: Assertions have been autogenerated by 
utils/update_analyze_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -passes='print,print' 
-disable-output %s 2>&1 | FileCheck --check-prefixes=LAA,AFTER %s
+; RUN: opt 
-passes='print,print,print' 
-disable-output %s 2>&1 | FileCheck --check-prefixes=BEFORE,LAA,AFTER %s
+
+target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
+
+declare void @use(ptr)
+
+; Check that scoped expressions created by LAA do not interfere with non-scoped
+; SCEVs with the same operands. The tests first run print to
+; populate the SCEV cache. They contain a GEP computing A+405, which is the end
+; of the accessed range, before and/or after the loop. No nuw flags should be
+; added to them in the second print output.
+
+define ptr @test_ptr_range_end_computed_before_and_after_loop(ptr %A) {
+; BEFORE-LABEL: 'test_ptr_range_end_computed_before_and_after_loop'
+; BEFORE-NEXT:  Classifying expressions for: 
@test_ptr_range_end_computed_before_and_after_loop
+; BEFORE:%x = getelementptr inbounds i8, ptr %A, i64 405
+; BEFORE-NEXT:--> (405 + %A) U: full-set S: full-set
+; BEFORE:%y = getelementptr inbounds i8, ptr %A, i64 405
+; BEFORE-NEXT:--> (405 + %A) U: full-set S: full-set
+;
+; LAA-LABEL: 'test_ptr_range_end_computed_before_and_after_loop'
+; LAA-NEXT:loop:
+; LAA-NEXT:  Memory dependences are safe with run-time checks
+; LAA-NEXT:  Dependences:
+; LAA-NEXT:  Run-time memory checks:
+; LAA-NEXT:  Check 0:
+; LAA-NEXT:Comparing group ([[GRP1:0x[0-9a-f]+]]):
+; LAA-NEXT:  %gep.A.400 = getelementptr inbounds i32, ptr %A.1, i64 %iv
+; LAA-NEXT:Against group ([[GRP2:0x[0-9a-f]+]]):
+; LAA-NEXT:  %gep.A = getelementptr inbounds i8, ptr %A, i64 %iv
+; LAA-NEXT:  Grouped accesses:
+; LAA-NEXT:Group [[GRP1]]:
+; LAA-NEXT:  (Low: (1 + %A) High: (405 + %A))
+; LAA-NEXT:Member: {(1 + %A),+,4}<%loop>
+; LAA-NEXT:Group [[GRP2]]:
+; LAA-NEXT:  (Low: %A High: (101 + %A))
+; LAA-NEXT:Member: {%A,+,1}<%loop>
+; LAA-EMPTY:
+; LAA-NEXT:  Non vectorizable stores to invariant address were not found 
in loop.
+; LAA-NEXT:  SCEV assumptions:
+; LAA-EMPTY:
+; LAA-NEXT:  Expressions re-written:
+;
+; AFTER-LABEL: 'test_ptr_range_end_computed_before_and_after_loop'
+; AFTER-NEXT:  Classifying expressions for: 
@test_ptr_range_end_computed_before_and_after_loop
+; AFTER:%x = getelementptr inbounds i8, ptr %A, i64 405
+; AFTER-NEXT:--> (405 + %A) U: full-set S: full-set
+; AFTER:%y = getelementptr inbounds i8, ptr %A, i64 405
+; AFTER-NEXT:--> (405 + %A) U: full-set S: full-set
+entry:
+  %A.1 = getelementptr inbounds i8, ptr %A, i64 1
+  %x = getelementptr inbounds i8, ptr %A, i64 405
+  call void @use(ptr %x)
+  br label %loop
+
+loop:
+  %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
+  %gep.A.400 = getelementptr inbounds i32, ptr %A.1, i64 %iv
+  %gep.A = getelementptr inbounds i8, ptr %A, i64 %iv
+  %l = load i8, ptr %gep.A, align 1
+  %ext = zext i8 %l to i32
+  store i32 %ext, ptr %gep.A.400, align 4
+  %iv.next = add nuw nsw i64 %iv, 1
+  %ec = icmp eq i64 %iv, 100
+  br i1 %ec, label %exit, label %loop
+
+exit:
+  %y = getelementptr inbounds i8, ptr %A, i64 405
+  ret ptr %y
+}
+
+define void @test_ptr_range_end_computed_before_loop(ptr %A) {
+; BEFORE-LABEL: 'test_ptr_range_end_computed_before_loop'
+; BEFORE-NEXT:  Classifying expressions for: 
@test_ptr_range_end_computed_before_loop
+; BEFORE-NEXT:%A.1 = getelementptr inbounds i8, ptr %A, i64 1
+; BEFORE-NEXT:--> (1 + %A) U: full-set S: full-set
+; BEFORE-NEXT:%x = getelementptr inbounds i8, ptr %A, i64 405
+;
+; LAA-LABEL: 'test_ptr_range_end_computed_before_loop'
+; LAA-NEXT:loop:
+; LAA-NEXT:  Memory dependences are safe with run-time checks
+; LAA-NEXT:  Dependences:
+; LAA-NEXT:  Run-time memory checks:
+; LAA-NEXT:  Check 0:
+; LAA-NEXT:Comparing group ([[GRP3:0x[0-9a-f]+]]):
+; LAA-NEXT:  %gep.A.400 = getelementptr inbounds i32, ptr %A.1, i64 %iv
+; LAA-NEXT:Against group ([[GRP4:0x[0-9a-f]+]]):
+; LAA-NEXT:  %gep.A = getelementptr inbounds i8, ptr 

[llvm-branch-commits] [compiler-rt] [TySan] Fixed false positive when accessing offset member variables (PR #95387)

2024-06-27 Thread Florian Hahn via llvm-branch-commits


@@ -221,7 +221,17 @@ __tysan_check(void *addr, int size, tysan_type_descriptor 
*td, int flags) {
 OldTDPtr -= i;
 OldTD = *OldTDPtr;
 
-if (!isAliasingLegal(td, OldTD))
+tysan_type_descriptor *InternalMember = OldTD;

fhahn wrote:

Could you add a comment here indicating what this does?

https://github.com/llvm/llvm-project/pull/95387


[llvm-branch-commits] [clang] [TySan] A Type Sanitizer (Clang) (PR #76260)

2024-06-27 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/76260

>From f45d4dc65537f3664472c873062fbda2a9bed984 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Thu, 18 Apr 2024 23:01:03 +0100
Subject: [PATCH 1/2] [TySan] A Type Sanitizer (Clang)

---
 clang/include/clang/Basic/Features.def |  1 +
 clang/include/clang/Basic/Sanitizers.def   |  3 ++
 clang/include/clang/Driver/SanitizerArgs.h |  1 +
 clang/lib/CodeGen/BackendUtil.cpp  |  6 +++
 clang/lib/CodeGen/CGDecl.cpp   |  3 +-
 clang/lib/CodeGen/CGDeclCXX.cpp|  4 ++
 clang/lib/CodeGen/CodeGenFunction.cpp  |  2 +
 clang/lib/CodeGen/CodeGenModule.cpp| 12 +++---
 clang/lib/CodeGen/CodeGenTBAA.cpp  |  6 ++-
 clang/lib/CodeGen/SanitizerMetadata.cpp| 44 +-
 clang/lib/CodeGen/SanitizerMetadata.h  | 13 ---
 clang/lib/Driver/SanitizerArgs.cpp | 13 +--
 clang/lib/Driver/ToolChains/CommonArgs.cpp |  6 ++-
 clang/lib/Driver/ToolChains/Darwin.cpp |  6 +++
 clang/lib/Driver/ToolChains/Linux.cpp  |  2 +
 clang/test/Driver/sanitizer-ld.c   | 23 +++
 16 files changed, 116 insertions(+), 29 deletions(-)

diff --git a/clang/include/clang/Basic/Features.def 
b/clang/include/clang/Basic/Features.def
index 53f410d3cb4bd..6a9921ffee884 100644
--- a/clang/include/clang/Basic/Features.def
+++ b/clang/include/clang/Basic/Features.def
@@ -100,6 +100,7 @@ FEATURE(numerical_stability_sanitizer, 
LangOpts.Sanitize.has(SanitizerKind::Nume
 FEATURE(memory_sanitizer,
 LangOpts.Sanitize.hasOneOf(SanitizerKind::Memory |
SanitizerKind::KernelMemory))
+FEATURE(type_sanitizer, LangOpts.Sanitize.has(SanitizerKind::Type))
 FEATURE(thread_sanitizer, LangOpts.Sanitize.has(SanitizerKind::Thread))
 FEATURE(dataflow_sanitizer, LangOpts.Sanitize.has(SanitizerKind::DataFlow))
 FEATURE(scudo, LangOpts.Sanitize.hasOneOf(SanitizerKind::Scudo))
diff --git a/clang/include/clang/Basic/Sanitizers.def 
b/clang/include/clang/Basic/Sanitizers.def
index bee35e9dca7c3..4b59b43437c2c 100644
--- a/clang/include/clang/Basic/Sanitizers.def
+++ b/clang/include/clang/Basic/Sanitizers.def
@@ -73,6 +73,9 @@ SANITIZER("fuzzer", Fuzzer)
 // libFuzzer-required instrumentation, no linking.
 SANITIZER("fuzzer-no-link", FuzzerNoLink)
 
+// TypeSanitizer
+SANITIZER("type", Type)
+
 // ThreadSanitizer
 SANITIZER("thread", Thread)
 
diff --git a/clang/include/clang/Driver/SanitizerArgs.h 
b/clang/include/clang/Driver/SanitizerArgs.h
index 47ef175302679..fde2ea3eac8ea 100644
--- a/clang/include/clang/Driver/SanitizerArgs.h
+++ b/clang/include/clang/Driver/SanitizerArgs.h
@@ -86,6 +86,7 @@ class SanitizerArgs {
   bool needsHwasanAliasesRt() const {
 return needsHwasanRt() && HwasanUseAliases;
   }
+  bool needsTysanRt() const { return Sanitizers.has(SanitizerKind::Type); }
   bool needsTsanRt() const { return Sanitizers.has(SanitizerKind::Thread); }
   bool needsMsanRt() const { return Sanitizers.has(SanitizerKind::Memory); }
   bool needsFuzzer() const { return Sanitizers.has(SanitizerKind::Fuzzer); }
diff --git a/clang/lib/CodeGen/BackendUtil.cpp 
b/clang/lib/CodeGen/BackendUtil.cpp
index b09680086248d..ff7cc5a8e48ba 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -80,6 +80,7 @@
 #include "llvm/Transforms/Instrumentation/SanitizerBinaryMetadata.h"
 #include "llvm/Transforms/Instrumentation/SanitizerCoverage.h"
 #include "llvm/Transforms/Instrumentation/ThreadSanitizer.h"
+#include "llvm/Transforms/Instrumentation/TypeSanitizer.h"
 #include "llvm/Transforms/ObjCARC.h"
 #include "llvm/Transforms/Scalar/EarlyCSE.h"
 #include "llvm/Transforms/Scalar/GVN.h"
@@ -707,6 +708,11 @@ static void addSanitizers(const Triple &TargetTriple,
   MPM.addPass(createModuleToFunctionPassAdaptor(ThreadSanitizerPass()));
 }
 
+if (LangOpts.Sanitize.has(SanitizerKind::Type)) {
+  MPM.addPass(ModuleTypeSanitizerPass());
+  MPM.addPass(createModuleToFunctionPassAdaptor(TypeSanitizerPass()));
+}
+
 auto ASanPass = [&](SanitizerMask Mask, bool CompileKernel) {
   if (LangOpts.Sanitize.has(Mask)) {
 bool UseGlobalGC = asanUseGlobalsGC(TargetTriple, CodeGenOpts);
diff --git a/clang/lib/CodeGen/CGDecl.cpp b/clang/lib/CodeGen/CGDecl.cpp
index 90aa4c0745a8a..4933f0c95fa8a 100644
--- a/clang/lib/CodeGen/CGDecl.cpp
+++ b/clang/lib/CodeGen/CGDecl.cpp
@@ -484,7 +484,8 @@ void CodeGenFunction::EmitStaticVarDecl(const VarDecl &D,
   LocalDeclMap.find(&D)->second = Address(castedAddr, elemTy, alignment);
   CGM.setStaticLocalDeclAddress(&D, castedAddr);
 
-  CGM.getSanitizerMetadata()->reportGlobal(var, D);
+  CGM.getSanitizerMetadata()->reportGlobalToASan(var, D);
+  CGM.getSanitizerMetadata()->reportGlobalToTySan(var, D);
 
   // Emit global variable debug descriptor for static vars.
   CGDebugInfo *DI = getDebugInfo();
diff --git a/clang/lib/CodeGen/CGDeclCXX.cpp b/clang/lib/CodeGen/CGDeclCXX

[llvm-branch-commits] [clang] [compiler-rt] [TySan] A Type Sanitizer (Runtime Library) (PR #76261)

2024-06-27 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/76261

>From 733b3ed3f7441453889157834e0a5b6c288bf976 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Thu, 27 Jun 2024 15:48:05 +0100
Subject: [PATCH] [tysan] Add runtime support

---
 clang/runtime/CMakeLists.txt  |   2 +-
 .../cmake/Modules/AllSupportedArchDefs.cmake  |   1 +
 compiler-rt/cmake/config-ix.cmake |  14 +-
 compiler-rt/lib/tysan/CMakeLists.txt  |  64 
 compiler-rt/lib/tysan/lit.cfg |  35 ++
 compiler-rt/lib/tysan/lit.site.cfg.in |  12 +
 compiler-rt/lib/tysan/tysan.cpp   | 339 ++
 compiler-rt/lib/tysan/tysan.h |  79 
 compiler-rt/lib/tysan/tysan.syms.extra|   2 +
 compiler-rt/lib/tysan/tysan_flags.inc |  17 +
 compiler-rt/lib/tysan/tysan_interceptors.cpp  | 250 +
 compiler-rt/lib/tysan/tysan_platform.h|  93 +
 compiler-rt/test/tysan/CMakeLists.txt |  32 ++
 compiler-rt/test/tysan/anon-ns.cpp|  41 +++
 compiler-rt/test/tysan/anon-same-struct.c |  26 ++
 compiler-rt/test/tysan/anon-struct.c  |  27 ++
 compiler-rt/test/tysan/basic.c|  65 
 compiler-rt/test/tysan/char-memcpy.c  |  45 +++
 compiler-rt/test/tysan/global.c   |  31 ++
 compiler-rt/test/tysan/int-long.c |  21 ++
 compiler-rt/test/tysan/lit.cfg.py | 139 +++
 compiler-rt/test/tysan/lit.site.cfg.py.in |  17 +
 compiler-rt/test/tysan/ptr-float.c|  19 +
 ...ruct-offset-multiple-compilation-units.cpp |  51 +++
 compiler-rt/test/tysan/struct-offset.c|  26 ++
 compiler-rt/test/tysan/struct.c   |  39 ++
 compiler-rt/test/tysan/union-wr-wr.c  |  18 +
 compiler-rt/test/tysan/violation-pr45282.c|  32 ++
 compiler-rt/test/tysan/violation-pr47137.c|  40 +++
 compiler-rt/test/tysan/violation-pr51837.c|  34 ++
 compiler-rt/test/tysan/violation-pr62544.c|  24 ++
 compiler-rt/test/tysan/violation-pr62828.cpp  |  44 +++
 compiler-rt/test/tysan/violation-pr68655.cpp  |  40 +++
 compiler-rt/test/tysan/violation-pr86685.c|  29 ++
 34 files changed, 1746 insertions(+), 2 deletions(-)
 create mode 100644 compiler-rt/lib/tysan/CMakeLists.txt
 create mode 100644 compiler-rt/lib/tysan/lit.cfg
 create mode 100644 compiler-rt/lib/tysan/lit.site.cfg.in
 create mode 100644 compiler-rt/lib/tysan/tysan.cpp
 create mode 100644 compiler-rt/lib/tysan/tysan.h
 create mode 100644 compiler-rt/lib/tysan/tysan.syms.extra
 create mode 100644 compiler-rt/lib/tysan/tysan_flags.inc
 create mode 100644 compiler-rt/lib/tysan/tysan_interceptors.cpp
 create mode 100644 compiler-rt/lib/tysan/tysan_platform.h
 create mode 100644 compiler-rt/test/tysan/CMakeLists.txt
 create mode 100644 compiler-rt/test/tysan/anon-ns.cpp
 create mode 100644 compiler-rt/test/tysan/anon-same-struct.c
 create mode 100644 compiler-rt/test/tysan/anon-struct.c
 create mode 100644 compiler-rt/test/tysan/basic.c
 create mode 100644 compiler-rt/test/tysan/char-memcpy.c
 create mode 100644 compiler-rt/test/tysan/global.c
 create mode 100644 compiler-rt/test/tysan/int-long.c
 create mode 100644 compiler-rt/test/tysan/lit.cfg.py
 create mode 100644 compiler-rt/test/tysan/lit.site.cfg.py.in
 create mode 100644 compiler-rt/test/tysan/ptr-float.c
 create mode 100644 
compiler-rt/test/tysan/struct-offset-multiple-compilation-units.cpp
 create mode 100644 compiler-rt/test/tysan/struct-offset.c
 create mode 100644 compiler-rt/test/tysan/struct.c
 create mode 100644 compiler-rt/test/tysan/union-wr-wr.c
 create mode 100644 compiler-rt/test/tysan/violation-pr45282.c
 create mode 100644 compiler-rt/test/tysan/violation-pr47137.c
 create mode 100644 compiler-rt/test/tysan/violation-pr51837.c
 create mode 100644 compiler-rt/test/tysan/violation-pr62544.c
 create mode 100644 compiler-rt/test/tysan/violation-pr62828.cpp
 create mode 100644 compiler-rt/test/tysan/violation-pr68655.cpp
 create mode 100644 compiler-rt/test/tysan/violation-pr86685.c

diff --git a/clang/runtime/CMakeLists.txt b/clang/runtime/CMakeLists.txt
index 65fcdc2868f03..ff2605b23d25b 100644
--- a/clang/runtime/CMakeLists.txt
+++ b/clang/runtime/CMakeLists.txt
@@ -122,7 +122,7 @@ if(LLVM_BUILD_EXTERNAL_COMPILER_RT AND EXISTS 
${COMPILER_RT_SRC_ROOT}/)
COMPONENT compiler-rt)
 
   # Add top-level targets that build specific compiler-rt runtimes.
-  set(COMPILER_RT_RUNTIMES fuzzer asan builtins dfsan lsan msan profile tsan 
ubsan ubsan-minimal)
+  set(COMPILER_RT_RUNTIMES fuzzer asan builtins dfsan lsan msan profile tsan 
tysan ubsan ubsan-minimal)
   foreach(runtime ${COMPILER_RT_RUNTIMES})
 get_ext_project_build_command(build_runtime_cmd ${runtime})
 add_custom_target(${runtime}
diff --git a/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake 
b/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
index ac4a71202384d..4701b58de4bda 1006

[llvm-branch-commits] [llvm] [LV] Disable VPlan-based cost model for 19.x release. (PR #100097)

2024-07-23 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn milestoned 
https://github.com/llvm/llvm-project/pull/100097


[llvm-branch-commits] [llvm] [LV] Disable VPlan-based cost model for 19.x release. (PR #100097)

2024-07-23 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn created 
https://github.com/llvm/llvm-project/pull/100097

As discussed in https://github.com/llvm/llvm-project/pull/92555, flip the 
default for the option added in
https://github.com/llvm/llvm-project/pull/99536 to true.

This restores the original behavior for the release branch to give the 
VPlan-based cost model more time to mature on main.

>From a72a0bf44a8b259be3c62e79082d2fdc04fc2771 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Tue, 23 Jul 2024 11:15:26 +0100
Subject: [PATCH] [LV] Disable VPlan-based cost model for 19.x release.

As discussed in  https://github.com/llvm/llvm-project/pull/92555 flip
the default for the option added in
https://github.com/llvm/llvm-project/pull/99536 to true.

This restores the original behavior for the release branch to give the
VPlan-based cost model more time to mature on main.
---
 llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | 2 +-
 .../test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll | 2 --
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp 
b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 6d28b8fabe42e..68363abdb817a 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -206,7 +206,7 @@ static cl::opt VectorizeMemoryCheckThreshold(
 cl::desc("The maximum allowed number of runtime memory checks"));
 
 static cl::opt UseLegacyCostModel(
-"vectorize-use-legacy-cost-model", cl::init(false), cl::Hidden,
+"vectorize-use-legacy-cost-model", cl::init(true), cl::Hidden,
 cl::desc("Use the legacy cost model instead of the VPlan-based cost model. 
"
  "This option will be removed in the future."));
 
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll 
b/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll
index fc310f4163082..1a78eaf644723 100644
--- a/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll
@@ -135,7 +135,6 @@ define void @vector_reverse_i64(ptr nocapture noundef 
writeonly %A, ptr nocaptur
 ; CHECK-NEXT:  LV: Interleaving is not beneficial.
 ; CHECK-NEXT:  LV: Found a vectorizable loop (vscale x 4) in 
 ; CHECK-NEXT:  LEV: Epilogue vectorization is not profitable for this loop
-; CHECK-NEXT:  VF picked by VPlan cost model: vscale x 4
 ; CHECK-NEXT:  Executing best plan with VF=vscale x 4, UF=1
 ; CHECK-NEXT:  VPlan 'Final VPlan for VF={vscale x 4},UF>=1' {
 ; CHECK-NEXT:  Live-in vp<%0> = VF * UF
@@ -339,7 +338,6 @@ define void @vector_reverse_f32(ptr nocapture noundef 
writeonly %A, ptr nocaptur
 ; CHECK-NEXT:  LV: Interleaving is not beneficial.
 ; CHECK-NEXT:  LV: Found a vectorizable loop (vscale x 4) in 
 ; CHECK-NEXT:  LEV: Epilogue vectorization is not profitable for this loop
-; CHECK-NEXT:  VF picked by VPlan cost model: vscale x 4
 ; CHECK-NEXT:  Executing best plan with VF=vscale x 4, UF=1
 ; CHECK-NEXT:  VPlan 'Final VPlan for VF={vscale x 4},UF>=1' {
 ; CHECK-NEXT:  Live-in vp<%0> = VF * UF



[llvm-branch-commits] [llvm] [LV] Disable VPlan-based cost model for 19.x release. (PR #100097)

2024-07-23 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn edited https://github.com/llvm/llvm-project/pull/100097


[llvm-branch-commits] [llvm] [LAA] Refine stride checks for SCEVs during dependence analysis. (#99… (PR #102201)

2024-08-06 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn milestoned 
https://github.com/llvm/llvm-project/pull/102201


[llvm-branch-commits] [llvm] [LAA] Refine stride checks for SCEVs during dependence analysis. (#99… (PR #102201)

2024-08-06 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn created 
https://github.com/llvm/llvm-project/pull/102201

…577)

Update getDependenceDistanceStrideAndSize to reason about different 
combinations of strides directly and explicitly.

Update getPtrStride to return 0 for invariant pointers.

Then proceed by checking the strides.

If either the source or the sink is neither strided by a constant (i.e. not a 
non-wrapping AddRec) nor loop-invariant, the accesses may overlap with earlier 
or later iterations, and we cannot generate runtime checks to disambiguate them.

Otherwise they are either loop invariant or strided. In that case, we can 
generate a runtime check to disambiguate them.

If both are strided by constants, we proceed as previously.

This is an alternative to
https://github.com/llvm/llvm-project/pull/99239 and also replaces the 
additional checks on whether the underlying object is loop-invariant.

Fixes https://github.com/llvm/llvm-project/issues/87189.

PR: https://github.com/llvm/llvm-project/pull/99577
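
The decision logic described above can be sketched as a small toy model. The 
names `DepDecision` and `classifyStrides` are hypothetical stand-ins, not the 
actual LAA API; a stride of 0 models a loop-invariant pointer and 
`std::nullopt` models a pointer that is neither invariant nor strided by a 
constant:

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical sketch of the three-way outcome described in the PR text.
enum class DepDecision {
  Unknown,               // may overlap arbitrary iterations; give up
  RuntimeCheck,          // invariant vs. strided: disambiguate at runtime
  ConstantDistanceCheck  // both constant-strided: analyze the distance
};

DepDecision classifyStrides(std::optional<int64_t> StrideA,
                            std::optional<int64_t> StrideB) {
  if (!StrideA || !StrideB)
    return DepDecision::Unknown;
  if (*StrideA == 0 || *StrideB == 0)
    return DepDecision::RuntimeCheck;
  return DepDecision::ConstantDistanceCheck;
}
```

Under these assumptions, a pointer with unknown stride forces the conservative
answer, while the invariant/strided mix falls back to a runtime check, matching
the prose above.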

>From 83098f4513567a054663b30380e4f2039ee8a6d0 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Fri, 26 Jul 2024 13:10:16 +0100
Subject: [PATCH] [LAA] Refine stride checks for SCEVs during dependence
 analysis. (#99577)

Update getDependenceDistanceStrideAndSize to reason about different
combinations of strides directly and explicitly.

Update getPtrStride to return 0 for invariant pointers.

Then proceed by checking the strides.

If either source or sink are not strided by a constant (i.e. not a
non-wrapping AddRec) or invariant, the accesses may overlap
with earlier or later iterations and we cannot generate runtime
checks to disambiguate them.

Otherwise they are either loop invariant or strided. In that case, we
can generate a runtime check to disambiguate them.

If both are strided by constants, we proceed as previously.

This is an alternative to
https://github.com/llvm/llvm-project/pull/99239 and also replaces
additional checks if the underlying object is loop-invariant.

Fixes https://github.com/llvm/llvm-project/issues/87189.

PR: https://github.com/llvm/llvm-project/pull/99577
---
 .../llvm/Analysis/LoopAccessAnalysis.h|  23 ++--
 llvm/lib/Analysis/LoopAccessAnalysis.cpp  | 121 --
 .../load-store-index-loaded-in-loop.ll|  26 ++--
 .../pointer-with-unknown-bounds.ll|   4 +-
 .../LoopAccessAnalysis/print-order.ll |   6 +-
 .../LoopAccessAnalysis/select-dependence.ll   |   4 +-
 .../LoopAccessAnalysis/symbolic-stride.ll |   4 +-
 7 files changed, 87 insertions(+), 101 deletions(-)

diff --git a/llvm/include/llvm/Analysis/LoopAccessAnalysis.h 
b/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
index afafb74bdcb0ac..95a74b91f7acbf 100644
--- a/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
+++ b/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
@@ -199,9 +199,8 @@ class MemoryDepChecker {
   /// Check whether the dependencies between the accesses are safe.
   ///
   /// Only checks sets with elements in \p CheckDeps.
-  bool areDepsSafe(DepCandidates &AccessSets, MemAccessInfoList &CheckDeps,
-   const DenseMap>
-   &UnderlyingObjects);
+  bool areDepsSafe(const DepCandidates &AccessSets,
+   const MemAccessInfoList &CheckDeps);
 
   /// No memory dependence was encountered that would inhibit
   /// vectorization.
@@ -351,11 +350,8 @@ class MemoryDepChecker {
   /// element access it records this distance in \p MinDepDistBytes (if this
   /// distance is smaller than any other distance encountered so far).
   /// Otherwise, this function returns true signaling a possible dependence.
-  Dependence::DepType
-  isDependent(const MemAccessInfo &A, unsigned AIdx, const MemAccessInfo &B,
-  unsigned BIdx,
-  const DenseMap>
-  &UnderlyingObjects);
+  Dependence::DepType isDependent(const MemAccessInfo &A, unsigned AIdx,
+  const MemAccessInfo &B, unsigned BIdx);
 
   /// Check whether the data dependence could prevent store-load
   /// forwarding.
@@ -392,11 +388,9 @@ class MemoryDepChecker {
   /// determined, or a struct containing (Distance, Stride, TypeSize, AIsWrite,
   /// BIsWrite).
   std::variant
-  getDependenceDistanceStrideAndSize(
-  const MemAccessInfo &A, Instruction *AInst, const MemAccessInfo &B,
-  Instruction *BInst,
-  const DenseMap>
-  &UnderlyingObjects);
+  getDependenceDistanceStrideAndSize(const MemAccessInfo &A, Instruction 
*AInst,
+ const MemAccessInfo &B,
+ Instruction *BInst);
 };
 
 class RuntimePointerChecking;
@@ -797,7 +791,8 @@ replaceSymbolicStrideSCEV(PredicatedScalarEvolution &PSE,
   Value *Ptr);
 
 /// If the pointer has a constant stride return it in units of the access type
-/// size.  Otherwise return std::nullopt.
+/// size. If the pointer is loop-invariant, return 0. Otherwise return
+/// std::nullopt.
 //
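
The three-way return contract documented in the comment above can be
illustrated with a toy model. `toyGetPtrStride` is a hypothetical name; the
real `getPtrStride` derives the stride from SCEV rather than from boolean
flags:

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Toy model of the documented contract: a constant stride in units of the
// access type size, 0 for a loop-invariant pointer, std::nullopt otherwise.
std::optional<int64_t> toyGetPtrStride(bool IsLoopInvariant,
                                       std::optional<int64_t> ConstStride) {
  if (IsLoopInvariant)
    return 0; // new behavior: invariant pointers report stride 0
  return ConstStride; // constant stride, or std::nullopt when unknown
}
```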

[llvm-branch-commits] [llvm] [LAA] Refine stride checks for SCEVs during dependence analysis. (#99… (PR #102201)

2024-08-06 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn edited https://github.com/llvm/llvm-project/pull/102201


[llvm-branch-commits] [llvm] [InstCombine] Don't look at ConstantData users (PR #103302)

2024-08-13 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn approved this pull request.

LGTM, thanks for the fix

https://github.com/llvm/llvm-project/pull/103302


[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79175 (PR #80274)

2024-02-05 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn approved this pull request.

LGTM as this fixes a miscompile

https://github.com/llvm/llvm-project/pull/80274


[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-07 Thread Florian Hahn via llvm-branch-commits


@@ -857,11 +857,7 @@ void VPlan::execute(VPTransformState *State) {
 Phi = cast(State->get(R.getVPSingleValue(), 0));
   } else {
 auto *WidenPhi = cast(&R);
-// TODO: Split off the case that all users of a pointer phi are scalar
-// from the VPWidenPointerInductionRecipe.
-if (WidenPhi->onlyScalarsGenerated(State->VF.isScalable()))
-  continue;
-
+assert(!WidenPhi->onlyScalarsGenerated(State->VF.isScalable()));

fhahn wrote:

Added, thanks!

https://github.com/llvm/llvm-project/pull/80273


[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-07 Thread Florian Hahn via llvm-branch-commits


@@ -537,6 +542,30 @@ void VPlanTransforms::optimizeInductions(VPlan &Plan, 
ScalarEvolution &SE) {
   bool HasOnlyVectorVFs = !Plan.hasVF(ElementCount::getFixed(1));
   VPBasicBlock::iterator InsertPt = HeaderVPBB->getFirstNonPhi();
   for (VPRecipeBase &Phi : HeaderVPBB->phis()) {

fhahn wrote:

Added a comment, thanks!

https://github.com/llvm/llvm-project/pull/80273


[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-07 Thread Florian Hahn via llvm-branch-commits


@@ -489,6 +490,23 @@ Value *VPInstruction::generateInstruction(VPTransformState 
&State,
 
 return ReducedPartRdx;
   }
+  case VPInstruction::PtrAdd: {
+if (vputils::onlyFirstLaneUsed(this)) {
+  auto *P = Builder.CreatePtrAdd(
+  State.get(getOperand(0), VPIteration(Part, 0)),
+  State.get(getOperand(1), VPIteration(Part, 0)), Name);
+  State.set(this, P, VPIteration(Part, 0));
+} else {
+  for (unsigned Lane = 0; Lane != State.VF.getKnownMinValue(); ++Lane) {
+Value *P = Builder.CreatePtrAdd(
+State.get(getOperand(0), VPIteration(Part, Lane)),
+State.get(getOperand(1), VPIteration(Part, Lane)), Name);
+
+State.set(this, P, VPIteration(Part, Lane));
+  }
+}
+return nullptr;

fhahn wrote:

Updated to split into generating for scalars (per lane, possibly optimizing 
depending on onlyFirstLaneUsed) and generating for vectors (per part).

Some of it could be done separately or as part of 
https://github.com/llvm/llvm-project/pull/80271, but it would be good to agree 
on the overall structure first and then land the pieces separately as makes sense.
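
The scalar path of that split (one value per lane, with a fast path when only
the first lane is used) can be sketched generically. Everything below uses
hypothetical stand-ins, not the actual VPlan classes; plain ints model
pointers for simplicity:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar path sketch: one "ptradd" per lane, or only lane 0 when an
// onlyFirstLaneUsed-style analysis says the remaining lanes are dead.
std::vector<int> generateScalars(const std::vector<int> &Base,
                                 const std::vector<int> &Offset,
                                 bool OnlyFirstLaneUsed) {
  std::size_t Lanes = OnlyFirstLaneUsed ? 1 : Base.size();
  std::vector<int> Result;
  for (std::size_t L = 0; L != Lanes; ++L)
    Result.push_back(Base[L] + Offset[L]); // stand-in for CreatePtrAdd
  return Result;
}
```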

https://github.com/llvm/llvm-project/pull/80273


[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-07 Thread Florian Hahn via llvm-branch-commits


@@ -546,9 +575,10 @@ void VPlanTransforms::optimizeInductions(VPlan &Plan, 
ScalarEvolution &SE) {
   continue;
 
 const InductionDescriptor &ID = WideIV->getInductionDescriptor();
-VPValue *Steps = createScalarIVSteps(Plan, ID, SE, WideIV->getTruncInst(),
- WideIV->getStartValue(),
- WideIV->getStepValue(), InsertPt);
+VPValue *Steps = createScalarIVSteps(
+Plan, ID.getKind(), SE, WideIV->getTruncInst(), 
WideIV->getStartValue(),
+WideIV->getStepValue(), ID.getInductionOpcode(), InsertPt,
+dyn_cast_or_null(ID.getInductionBinOp()));

fhahn wrote:

Adjusted, thanks! Can split off moving the induction opcode field separately.

https://github.com/llvm/llvm-project/pull/80273


[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-07 Thread Florian Hahn via llvm-branch-commits


@@ -2503,6 +2504,12 @@ class VPDerivedIVRecipe : public VPSingleDefRecipe {
 dyn_cast_or_null(IndDesc.getInductionBinOp()),
 Start, CanonicalIV, Step) {}
 
+  VPDerivedIVRecipe(InductionDescriptor::InductionKind Kind, VPValue *Start,
+VPCanonicalIVPHIRecipe *CanonicalIV, VPValue *Step,
+FPMathOperator *FPBinOp)

fhahn wrote:

Made the private one public and removed this one here, thanks!

https://github.com/llvm/llvm-project/pull/80273


[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-07 Thread Florian Hahn via llvm-branch-commits


@@ -515,6 +533,8 @@ void VPInstruction::execute(VPTransformState &State) {
 State.Builder.setFastMathFlags(getFastMathFlags());
   for (unsigned Part = 0; Part < State.UF; ++Part) {
 Value *GeneratedValue = generateInstruction(State, Part);
+if (!GeneratedValue)
+  continue;
 if (!hasResult())
   continue;

fhahn wrote:

Completely reworked this, the check for !GeneratedValue is gone now

https://github.com/llvm/llvm-project/pull/80273


[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-07 Thread Florian Hahn via llvm-branch-commits


@@ -537,6 +542,30 @@ void VPlanTransforms::optimizeInductions(VPlan &Plan, 
ScalarEvolution &SE) {
   bool HasOnlyVectorVFs = !Plan.hasVF(ElementCount::getFixed(1));
   VPBasicBlock::iterator InsertPt = HeaderVPBB->getFirstNonPhi();
   for (VPRecipeBase &Phi : HeaderVPBB->phis()) {
+if (auto *PtrIV = dyn_cast(&Phi)) {
+  if (!PtrIV->onlyScalarsGenerated(Plan.hasScalableVF()))
+continue;
+
+  const InductionDescriptor &ID = PtrIV->getInductionDescriptor();
+  VPValue *StartV = Plan.getVPValueOrAddLiveIn(
+  ConstantInt::get(ID.getStep()->getType(), 0));
+  VPValue *StepV = PtrIV->getOperand(1);
+  VPRecipeBase *Steps =

fhahn wrote:

Will do separately, thanks!

https://github.com/llvm/llvm-project/pull/80273


[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-07 Thread Florian Hahn via llvm-branch-commits


@@ -489,15 +489,18 @@ void VPlanTransforms::removeDeadRecipes(VPlan &Plan) {
   }
 }
 
-static VPValue *createScalarIVSteps(VPlan &Plan, const InductionDescriptor &ID,
+static VPValue *createScalarIVSteps(VPlan &Plan,
+InductionDescriptor::InductionKind Kind,
 ScalarEvolution &SE, Instruction *TruncI,
 VPValue *StartV, VPValue *Step,
-VPBasicBlock::iterator IP) {
+Instruction::BinaryOps InductionOpcode,
+VPBasicBlock::iterator IP,
+FPMathOperator *FPBinOp = nullptr) {
   VPBasicBlock *HeaderVPBB = Plan.getVectorLoopRegion()->getEntryBasicBlock();
   VPCanonicalIVPHIRecipe *CanonicalIV = Plan.getCanonicalIV();
   VPSingleDefRecipe *BaseIV = CanonicalIV;
-  if (!CanonicalIV->isCanonical(ID.getKind(), StartV, Step)) {
-BaseIV = new VPDerivedIVRecipe(ID, StartV, CanonicalIV, Step);
+  if (!CanonicalIV->isCanonical(Kind, StartV, Step)) {
+BaseIV = new VPDerivedIVRecipe(Kind, StartV, CanonicalIV, Step, FPBinOp);

fhahn wrote:

Yes, can split off!

https://github.com/llvm/llvm-project/pull/80273


[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-07 Thread Florian Hahn via llvm-branch-commits


@@ -515,6 +533,8 @@ void VPInstruction::execute(VPTransformState &State) {
 State.Builder.setFastMathFlags(getFastMathFlags());
   for (unsigned Part = 0; Part < State.UF; ++Part) {
 Value *GeneratedValue = generateInstruction(State, Part);
+if (!GeneratedValue)
+  continue;
 if (!hasResult())
   continue;
 assert(GeneratedValue && "generateInstruction must produce a value");

fhahn wrote:

Reworked now, the check for !GeneratedValue is gone now

https://github.com/llvm/llvm-project/pull/80273


[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-07 Thread Florian Hahn via llvm-branch-commits


@@ -537,6 +542,30 @@ void VPlanTransforms::optimizeInductions(VPlan &Plan, 
ScalarEvolution &SE) {
   bool HasOnlyVectorVFs = !Plan.hasVF(ElementCount::getFixed(1));
   VPBasicBlock::iterator InsertPt = HeaderVPBB->getFirstNonPhi();
   for (VPRecipeBase &Phi : HeaderVPBB->phis()) {
+if (auto *PtrIV = dyn_cast(&Phi)) {
+  if (!PtrIV->onlyScalarsGenerated(Plan.hasScalableVF()))
+continue;
+
+  const InductionDescriptor &ID = PtrIV->getInductionDescriptor();
+  VPValue *StartV = Plan.getVPValueOrAddLiveIn(
+  ConstantInt::get(ID.getStep()->getType(), 0));

fhahn wrote:

The start value of the pointer induction is the pointer base, but here we need 
the start value for the generated offsets.

https://github.com/llvm/llvm-project/pull/80273


[llvm-branch-commits] [llvm] [TBAA] Only clear TBAAStruct if field can be extracted. (PR #81285)

2024-02-09 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn created https://github.com/llvm/llvm-project/pull/81285

Retain TBAAStruct if we fail to match the access to a single field. All users 
at the moment use this when using the full size of the original access. SROA 
also retains the original TBAAStruct when accessing parts at offset 0.

Motivation for this and follow-on patches is to improve codegen for libc++, 
where using memcpy limits optimizations, like vectorization for code iteration 
over std::vector>: https://godbolt.org/z/f3vqYos3c

Depends on https://github.com/llvm/llvm-project/pull/81284

>From 99cf032dfabb21b820559bae61d2354e56336fdd Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Fri, 9 Feb 2024 16:25:32 +
Subject: [PATCH] [TBAA] Only clear TBAAStruct if field can be extracted.

Retain TBAAStruct if we fail to match the access to a single field. All
users at the moment use this when using the full size of the original
access. SROA also retains the original TBAAStruct when accessing parts
at offset 0.

Motivation for this and follow-on patches is to improve codegen for
libc++, where using memcpy limits optimizations, like vectorization for
code iteration over std::vector>:
https://godbolt.org/z/f3vqYos3c

Depends on https://github.com/llvm/llvm-project/pull/81284
---
 llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp   | 8 +---
 llvm/test/Transforms/InstCombine/struct-assign-tbaa.ll | 5 +++--
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp 
b/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
index edc08cde686f1f..bfd70414c0340c 100644
--- a/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
+++ b/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
@@ -821,13 +821,15 @@ MDNode *AAMDNodes::extendToTBAA(MDNode *MD, ssize_t Len) {
 AAMDNodes AAMDNodes::adjustForAccess(unsigned AccessSize) {
   AAMDNodes New = *this;
   MDNode *M = New.TBAAStruct;
-  New.TBAAStruct = nullptr;
   if (M && M->getNumOperands() == 3 && M->getOperand(0) &&
   mdconst::hasa(M->getOperand(0)) &&
   mdconst::extract(M->getOperand(0))->isZero() &&
   M->getOperand(1) && mdconst::hasa(M->getOperand(1)) &&
-  mdconst::extract(M->getOperand(1))->getValue() == 
AccessSize &&
-  M->getOperand(2) && isa(M->getOperand(2)))
+  mdconst::extract(M->getOperand(1))->getValue() ==
+  AccessSize &&
+  M->getOperand(2) && isa(M->getOperand(2))) {
+New.TBAAStruct = nullptr;
 New.TBAA = cast(M->getOperand(2));
+  }
   return New;
 }
diff --git a/llvm/test/Transforms/InstCombine/struct-assign-tbaa.ll 
b/llvm/test/Transforms/InstCombine/struct-assign-tbaa.ll
index 1042c413fbb7bb..996d2c0e67e165 100644
--- a/llvm/test/Transforms/InstCombine/struct-assign-tbaa.ll
+++ b/llvm/test/Transforms/InstCombine/struct-assign-tbaa.ll
@@ -38,8 +38,8 @@ define ptr @test2() {
 define void @test3_multiple_fields(ptr nocapture %a, ptr nocapture %b) {
 ; CHECK-LABEL: @test3_multiple_fields(
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:[[TMP0:%.*]] = load i64, ptr [[B:%.*]], align 4
-; CHECK-NEXT:store i64 [[TMP0]], ptr [[A:%.*]], align 4
+; CHECK-NEXT:[[TMP0:%.*]] = load i64, ptr [[B:%.*]], align 4, !tbaa.struct 
[[TBAA_STRUCT3:![0-9]+]]
+; CHECK-NEXT:store i64 [[TMP0]], ptr [[A:%.*]], align 4, !tbaa.struct 
[[TBAA_STRUCT3]]
 ; CHECK-NEXT:ret void
 ;
 entry:
@@ -86,4 +86,5 @@ entry:
 ; CHECK: [[TBAA0]] = !{[[META1:![0-9]+]], [[META1]], i64 0}
 ; CHECK: [[META1]] = !{!"float", [[META2:![0-9]+]]}
 ; CHECK: [[META2]] = !{!"Simple C/C++ TBAA"}
+; CHECK: [[TBAA_STRUCT3]] = !{i64 0, i64 4, [[TBAA0]], i64 4, i64 4, [[TBAA0]]}
 ;.



[llvm-branch-commits] [llvm] [SROA] Use !tbaa instead of !tbaa.struct if op matches field. (PR #81289)

2024-02-09 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn created https://github.com/llvm/llvm-project/pull/81289

If a split memory access introduced by SROA accesses precisely a single field 
of the original operation's !tbaa.struct, use the !tbaa tag for the accessed 
field directly instead of the full !tbaa.struct.

InstCombine already had a similar logic.

Motivation for this and follow-on patches is to improve codegen for libc++, 
where using memcpy limits optimizations, like vectorization for code iteration 
over std::vector>: https://godbolt.org/z/f3vqYos3c

Depends on https://github.com/llvm/llvm-project/pull/81285.

>From 90639e9131670863ebb4c199a9861b2b0094d601 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Fri, 9 Feb 2024 15:17:09 +
Subject: [PATCH] [SROA] Use !tbaa instead of !tbaa.struct if op matches field.

If a split memory access introduced by SROA accesses precisely a single
field of the original operation's !tbaa.struct, use the !tbaa tag for
the accessed field directly instead of the full !tbaa.struct.

InstCombine already had a similar logic.

Motivation for this and follow-on patches is to improve codegen for
libc++, where using memcpy limits optimizations, like vectorization for
code iteration over std::vector>:
https://godbolt.org/z/f3vqYos3c

Depends on https://github.com/llvm/llvm-project/pull/81285.
---
 llvm/include/llvm/IR/Metadata.h  |  2 +
 llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp | 13 ++
 llvm/lib/Transforms/Scalar/SROA.cpp  | 48 ++--
 llvm/test/Transforms/SROA/tbaa-struct2.ll| 21 -
 llvm/test/Transforms/SROA/tbaa-struct3.ll| 16 +++
 5 files changed, 67 insertions(+), 33 deletions(-)

diff --git a/llvm/include/llvm/IR/Metadata.h b/llvm/include/llvm/IR/Metadata.h
index 6f23ac44dee968..33363a271d4823 100644
--- a/llvm/include/llvm/IR/Metadata.h
+++ b/llvm/include/llvm/IR/Metadata.h
@@ -849,6 +849,8 @@ struct AAMDNodes {
   /// If his AAMDNode has !tbaa.struct and \p AccessSize matches the size of 
the
   /// field at offset 0, get the TBAA tag describing the accessed field.
   AAMDNodes adjustForAccess(unsigned AccessSize);
+  AAMDNodes adjustForAccess(size_t Offset, Type *AccessTy,
+const DataLayout &DL);
 };
 
 // Specialize DenseMapInfo for AAMDNodes.
diff --git a/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp 
b/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
index bfd70414c0340c..b2dc451d581939 100644
--- a/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
+++ b/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
@@ -833,3 +833,16 @@ AAMDNodes AAMDNodes::adjustForAccess(unsigned AccessSize) {
   }
   return New;
 }
+
+AAMDNodes AAMDNodes::adjustForAccess(size_t Offset, Type *AccessTy,
+ const DataLayout &DL) {
+
+  AAMDNodes New = shift(Offset);
+  if (!DL.typeSizeEqualsStoreSize(AccessTy))
+return New;
+  TypeSize Size = DL.getTypeStoreSize(AccessTy);
+  if (Size.isScalable())
+return New;
+
+  return New.adjustForAccess(Size.getKnownMinValue());
+}
diff --git a/llvm/lib/Transforms/Scalar/SROA.cpp 
b/llvm/lib/Transforms/Scalar/SROA.cpp
index 138dc38b5c14ce..f24cbbc1fe0591 100644
--- a/llvm/lib/Transforms/Scalar/SROA.cpp
+++ b/llvm/lib/Transforms/Scalar/SROA.cpp
@@ -2914,7 +2914,8 @@ class AllocaSliceRewriter : public 
InstVisitor {
 
   // Do this after copyMetadataForLoad() to preserve the TBAA shift.
   if (AATags)
-NewLI->setAAMetadata(AATags.shift(NewBeginOffset - BeginOffset));
+NewLI->setAAMetadata(AATags.adjustForAccess(
+NewBeginOffset - BeginOffset, NewLI->getType(), DL));
 
   // Try to preserve nonnull metadata
   V = NewLI;
@@ -2936,7 +2937,9 @@ class AllocaSliceRewriter : public 
InstVisitor {
   IRB.CreateAlignedLoad(TargetTy, getNewAllocaSlicePtr(IRB, LTy),
 getSliceAlign(), LI.isVolatile(), 
LI.getName());
   if (AATags)
-NewLI->setAAMetadata(AATags.shift(NewBeginOffset - BeginOffset));
+NewLI->setAAMetadata(AATags.adjustForAccess(
+NewBeginOffset - BeginOffset, NewLI->getType(), DL));
+
   if (LI.isVolatile())
 NewLI->setAtomic(LI.getOrdering(), LI.getSyncScopeID());
   NewLI->copyMetadata(LI, {LLVMContext::MD_mem_parallel_loop_access,
@@ -3011,7 +3014,8 @@ class AllocaSliceRewriter : public 
InstVisitor {
 Store->copyMetadata(SI, {LLVMContext::MD_mem_parallel_loop_access,
  LLVMContext::MD_access_group});
 if (AATags)
-  Store->setAAMetadata(AATags.shift(NewBeginOffset - BeginOffset));
+  Store->setAAMetadata(AATags.adjustForAccess(NewBeginOffset - BeginOffset,
+  V->getType(), DL));
 Pass.DeadInsts.push_back(&SI);
 
 // NOTE: Careful to use OrigV rather than V.
@@ -3038,7 +3042,8 @@ class AllocaSliceRewriter : public 
InstVisitor {
 Store->copyMetadata(SI, {LLVMContext::MD_mem_parallel_loop_access,

[llvm-branch-commits] [llvm] [TBAA] Only clear TBAAStruct if field can be extracted. (PR #81285)

2024-02-09 Thread Florian Hahn via llvm-branch-commits


@@ -821,13 +821,15 @@ MDNode *AAMDNodes::extendToTBAA(MDNode *MD, ssize_t Len) {
 AAMDNodes AAMDNodes::adjustForAccess(unsigned AccessSize) {
   AAMDNodes New = *this;
   MDNode *M = New.TBAAStruct;
-  New.TBAAStruct = nullptr;
   if (M && M->getNumOperands() == 3 && M->getOperand(0) &&

fhahn wrote:

Yep, I left this as is here to keep the changes small; I'll soon share the 
follow-up change in the chain.

https://github.com/llvm/llvm-project/pull/81285


[llvm-branch-commits] [llvm] [TBAA] Use !tbaa for first accessed field, even if there are others. (PR #81313)

2024-02-09 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn created https://github.com/llvm/llvm-project/pull/81313

Motivation for this and follow-on patches is to improve codegen for libc++, 
where using memcpy limits optimizations, like vectorization for code iteration 
over std::vector>: https://godbolt.org/z/f3vqYos3c

Depends on https://github.com/llvm/llvm-project/pull/81289.

>From e879ab07a6b39d7cf47fbc3c17ff25918cdee628 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Fri, 9 Feb 2024 16:48:26 +
Subject: [PATCH] [TBAA] Use !tbaa for first accessed field, even if there are
 others.

Motivation for this and follow-on patches is to improve codegen for
libc++, where using memcpy limits optimizations, like vectorization for
code iteration over std::vector>:
https://godbolt.org/z/f3vqYos3c

Depends on https://github.com/llvm/llvm-project/pull/81289.
---
 llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp |  3 +--
 llvm/test/Transforms/SROA/tbaa-struct2.ll| 21 +++
 llvm/test/Transforms/SROA/tbaa-struct3.ll| 28 ++--
 3 files changed, 25 insertions(+), 27 deletions(-)

diff --git a/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp 
b/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
index b2dc451d581939..25ac01db7633ee 100644
--- a/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
+++ b/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
@@ -821,8 +821,7 @@ MDNode *AAMDNodes::extendToTBAA(MDNode *MD, ssize_t Len) {
 AAMDNodes AAMDNodes::adjustForAccess(unsigned AccessSize) {
   AAMDNodes New = *this;
   MDNode *M = New.TBAAStruct;
-  if (M && M->getNumOperands() == 3 && M->getOperand(0) &&
-  mdconst::hasa(M->getOperand(0)) &&
+  if (M && M->getOperand(0) && mdconst::hasa(M->getOperand(0)) &&
   mdconst::extract(M->getOperand(0))->isZero() &&
   M->getOperand(1) && mdconst::hasa(M->getOperand(1)) &&
   mdconst::extract(M->getOperand(1))->getValue() ==
diff --git a/llvm/test/Transforms/SROA/tbaa-struct2.ll 
b/llvm/test/Transforms/SROA/tbaa-struct2.ll
index 02c99a2b329457..545fa47eecb2ce 100644
--- a/llvm/test/Transforms/SROA/tbaa-struct2.ll
+++ b/llvm/test/Transforms/SROA/tbaa-struct2.ll
@@ -11,11 +11,11 @@ declare double @subcall(double %g, i32 %m)
 define double @bar(ptr %wishart) {
 ; CHECK-LABEL: @bar(
 ; CHECK-NEXT:[[TMP_SROA_3:%.*]] = alloca [4 x i8], align 4
-; CHECK-NEXT:[[TMP_SROA_0_0_COPYLOAD:%.*]] = load double, ptr 
[[WISHART:%.*]], align 8, !tbaa.struct [[TBAA_STRUCT0:![0-9]+]]
+; CHECK-NEXT:[[TMP_SROA_0_0_COPYLOAD:%.*]] = load double, ptr 
[[WISHART:%.*]], align 8, !tbaa [[TBAA0:![0-9]+]]
 ; CHECK-NEXT:[[TMP_SROA_2_0_WISHART_SROA_IDX:%.*]] = getelementptr 
inbounds i8, ptr [[WISHART]], i64 8
-; CHECK-NEXT:[[TMP_SROA_2_0_COPYLOAD:%.*]] = load i32, ptr 
[[TMP_SROA_2_0_WISHART_SROA_IDX]], align 8, !tbaa [[TBAA5:![0-9]+]]
+; CHECK-NEXT:[[TMP_SROA_2_0_COPYLOAD:%.*]] = load i32, ptr 
[[TMP_SROA_2_0_WISHART_SROA_IDX]], align 8, !tbaa [[TBAA4:![0-9]+]]
 ; CHECK-NEXT:[[TMP_SROA_3_0_WISHART_SROA_IDX:%.*]] = getelementptr 
inbounds i8, ptr [[WISHART]], i64 12
-; CHECK-NEXT:call void @llvm.memcpy.p0.p0.i64(ptr align 4 [[TMP_SROA_3]], 
ptr align 4 [[TMP_SROA_3_0_WISHART_SROA_IDX]], i64 4, i1 false), !tbaa.struct 
[[TBAA_STRUCT7:![0-9]+]]
+; CHECK-NEXT:call void @llvm.memcpy.p0.p0.i64(ptr align 4 [[TMP_SROA_3]], 
ptr align 4 [[TMP_SROA_3_0_WISHART_SROA_IDX]], i64 4, i1 false), !tbaa.struct 
[[TBAA_STRUCT6:![0-9]+]]
 ; CHECK-NEXT:[[CALL:%.*]] = call double @subcall(double 
[[TMP_SROA_0_0_COPYLOAD]], i32 [[TMP_SROA_2_0_COPYLOAD]])
 ; CHECK-NEXT:ret double [[CALL]]
 ;
@@ -38,14 +38,13 @@ define double @bar(ptr %wishart) {
 ;.
 ; CHECK: attributes #[[ATTR0:[0-9]+]] = { nocallback nofree nounwind 
willreturn memory(argmem: readwrite) }
 ;.
-; CHECK: [[TBAA_STRUCT0]] = !{i64 0, i64 8, [[META1:![0-9]+]], i64 8, i64 4, 
[[TBAA5]]}
-; CHECK: [[META1]] = !{[[META2:![0-9]+]], [[META2]], i64 0}
-; CHECK: [[META2]] = !{!"double", [[META3:![0-9]+]], i64 0}
-; CHECK: [[META3]] = !{!"omnipotent char", [[META4:![0-9]+]], i64 0}
-; CHECK: [[META4]] = !{!"Simple C++ TBAA"}
-; CHECK: [[TBAA5]] = !{[[META6:![0-9]+]], [[META6]], i64 0}
-; CHECK: [[META6]] = !{!"int", [[META3]], i64 0}
-; CHECK: [[TBAA_STRUCT7]] = !{}
+; CHECK: [[TBAA0]] = !{[[META1:![0-9]+]], [[META1]], i64 0}
+; CHECK: [[META1]] = !{!"double", [[META2:![0-9]+]], i64 0}
+; CHECK: [[META2]] = !{!"omnipotent char", [[META3:![0-9]+]], i64 0}
+; CHECK: [[META3]] = !{!"Simple C++ TBAA"}
+; CHECK: [[TBAA4]] = !{[[META5:![0-9]+]], [[META5]], i64 0}
+; CHECK: [[META5]] = !{!"int", [[META2]], i64 0}
+; CHECK: [[TBAA_STRUCT6]] = !{}
 ;.
 ;; NOTE: These prefixes are unused and the list is autogenerated. Do not add 
tests below this line:
 ; CHECK-MODIFY-CFG: {{.*}}
diff --git a/llvm/test/Transforms/SROA/tbaa-struct3.ll 
b/llvm/test/Transforms/SROA/tbaa-struct3.ll
index 603e7d708647fc..68553d9b1a270b 100644
--- a/llvm/test/Transforms/SROA/tbaa-struct3.ll
+++ b/llvm/test/Transforms/SROA/tbaa-struct3.ll
@@ -7,9 +7,9 @@ define void @load

[llvm-branch-commits] [llvm] [TBAA] Only clear TBAAStruct if field can be extracted. (PR #81285)

2024-02-09 Thread Florian Hahn via llvm-branch-commits


@@ -821,13 +821,15 @@ MDNode *AAMDNodes::extendToTBAA(MDNode *MD, ssize_t Len) {
 AAMDNodes AAMDNodes::adjustForAccess(unsigned AccessSize) {
   AAMDNodes New = *this;
   MDNode *M = New.TBAAStruct;
-  New.TBAAStruct = nullptr;
   if (M && M->getNumOperands() == 3 && M->getOperand(0) &&

fhahn wrote:

Here it is: #81313

https://github.com/llvm/llvm-project/pull/81285


[llvm-branch-commits] [llvm] [SROA] Use !tbaa instead of !tbaa.struct if op matches field. (PR #81289)

2024-02-09 Thread Florian Hahn via llvm-branch-commits


@@ -7,9 +7,9 @@ define void @load_store_transfer_split_struct_tbaa_2_float(ptr 
dereferenceable(2
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:[[TMP0:%.*]] = bitcast float [[A]] to i32
 ; CHECK-NEXT:[[TMP1:%.*]] = bitcast float [[B]] to i32
-; CHECK-NEXT:store i32 [[TMP0]], ptr [[RES]], align 4
+; CHECK-NEXT:store i32 [[TMP0]], ptr [[RES]], align 4, !tbaa.struct 
[[TBAA_STRUCT0:![0-9]+]]

fhahn wrote:

Yes, this should be solved by a separate improvement: #81313

https://github.com/llvm/llvm-project/pull/81289


[llvm-branch-commits] [llvm] [SROA] Use !tbaa instead of !tbaa.struct if op matches field. (PR #81289)

2024-02-09 Thread Florian Hahn via llvm-branch-commits

fhahn wrote:

> Hmm. 10 changes + 1 new usage of setAAMetaData, But only 4 relevant changes 
> in tests.. That part of SROA seems to lack some testing ?

Yes, will add the missing coverage, just wanted to make sure this makes sense 
in general beforehand

https://github.com/llvm/llvm-project/pull/81289


[llvm-branch-commits] [llvm] [TBAA] Use !tbaa for first accessed field if it is an exact match in offset and size. (PR #81313)

2024-02-12 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn edited https://github.com/llvm/llvm-project/pull/81313


[llvm-branch-commits] [llvm] [TBAA] Use !tbaa for first accessed field if it is an exact match in offset and size. (PR #81313)

2024-02-12 Thread Florian Hahn via llvm-branch-commits

fhahn wrote:

> lgtm
> 
> Maybe rephrase the commit message to something like:
> 
> ```
>  [tbaa] Use !tbaa for first accessed field if it is an exact match in offset 
> and size. 
> ```

Updated, thanks! It would be great if you could take another look at 
https://github.com/llvm/llvm-project/pull/81289 in case you missed my response 
to your comment, as this PR depends on 
https://github.com/llvm/llvm-project/pull/81289

https://github.com/llvm/llvm-project/pull/81313


[llvm-branch-commits] [llvm] [SROA] Use !tbaa instead of !tbaa.struct if op matches field. (PR #81289)

2024-02-14 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/81289

>From 90639e9131670863ebb4c199a9861b2b0094d601 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Fri, 9 Feb 2024 15:17:09 +
Subject: [PATCH] [SROA] Use !tbaa instead of !tbaa.struct if op matches field.

If a split memory access introduced by SROA accesses precisely a single
field of the original operation's !tbaa.struct, use the !tbaa tag for
the accessed field directly instead of the full !tbaa.struct.

InstCombine already had a similar logic.

Motivation for this and follow-on patches is to improve codegen for
libc++, where using memcpy limits optimizations, like vectorization for
code iteration over std::vector>:
https://godbolt.org/z/f3vqYos3c

Depends on https://github.com/llvm/llvm-project/pull/81285.
---
 llvm/include/llvm/IR/Metadata.h  |  2 +
 llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp | 13 ++
 llvm/lib/Transforms/Scalar/SROA.cpp  | 48 ++--
 llvm/test/Transforms/SROA/tbaa-struct2.ll| 21 -
 llvm/test/Transforms/SROA/tbaa-struct3.ll| 16 +++
 5 files changed, 67 insertions(+), 33 deletions(-)

diff --git a/llvm/include/llvm/IR/Metadata.h b/llvm/include/llvm/IR/Metadata.h
index 6f23ac44dee968..33363a271d4823 100644
--- a/llvm/include/llvm/IR/Metadata.h
+++ b/llvm/include/llvm/IR/Metadata.h
@@ -849,6 +849,8 @@ struct AAMDNodes {
   /// If his AAMDNode has !tbaa.struct and \p AccessSize matches the size of 
the
   /// field at offset 0, get the TBAA tag describing the accessed field.
   AAMDNodes adjustForAccess(unsigned AccessSize);
+  AAMDNodes adjustForAccess(size_t Offset, Type *AccessTy,
+const DataLayout &DL);
 };
 
 // Specialize DenseMapInfo for AAMDNodes.
diff --git a/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp 
b/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
index bfd70414c0340c..b2dc451d581939 100644
--- a/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
+++ b/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
@@ -833,3 +833,16 @@ AAMDNodes AAMDNodes::adjustForAccess(unsigned AccessSize) {
   }
   return New;
 }
+
+AAMDNodes AAMDNodes::adjustForAccess(size_t Offset, Type *AccessTy,
+ const DataLayout &DL) {
+
+  AAMDNodes New = shift(Offset);
+  if (!DL.typeSizeEqualsStoreSize(AccessTy))
+return New;
+  TypeSize Size = DL.getTypeStoreSize(AccessTy);
+  if (Size.isScalable())
+return New;
+
+  return New.adjustForAccess(Size.getKnownMinValue());
+}
diff --git a/llvm/lib/Transforms/Scalar/SROA.cpp 
b/llvm/lib/Transforms/Scalar/SROA.cpp
index 138dc38b5c14ce..f24cbbc1fe0591 100644
--- a/llvm/lib/Transforms/Scalar/SROA.cpp
+++ b/llvm/lib/Transforms/Scalar/SROA.cpp
@@ -2914,7 +2914,8 @@ class AllocaSliceRewriter : public 
InstVisitor {
 
   // Do this after copyMetadataForLoad() to preserve the TBAA shift.
   if (AATags)
-NewLI->setAAMetadata(AATags.shift(NewBeginOffset - BeginOffset));
+NewLI->setAAMetadata(AATags.adjustForAccess(
+NewBeginOffset - BeginOffset, NewLI->getType(), DL));
 
   // Try to preserve nonnull metadata
   V = NewLI;
@@ -2936,7 +2937,9 @@ class AllocaSliceRewriter : public 
InstVisitor {
   IRB.CreateAlignedLoad(TargetTy, getNewAllocaSlicePtr(IRB, LTy),
 getSliceAlign(), LI.isVolatile(), 
LI.getName());
   if (AATags)
-NewLI->setAAMetadata(AATags.shift(NewBeginOffset - BeginOffset));
+NewLI->setAAMetadata(AATags.adjustForAccess(
+NewBeginOffset - BeginOffset, NewLI->getType(), DL));
+
   if (LI.isVolatile())
 NewLI->setAtomic(LI.getOrdering(), LI.getSyncScopeID());
   NewLI->copyMetadata(LI, {LLVMContext::MD_mem_parallel_loop_access,
@@ -3011,7 +3014,8 @@ class AllocaSliceRewriter : public 
InstVisitor {
 Store->copyMetadata(SI, {LLVMContext::MD_mem_parallel_loop_access,
  LLVMContext::MD_access_group});
 if (AATags)
-  Store->setAAMetadata(AATags.shift(NewBeginOffset - BeginOffset));
+  Store->setAAMetadata(AATags.adjustForAccess(NewBeginOffset - BeginOffset,
+  V->getType(), DL));
 Pass.DeadInsts.push_back(&SI);
 
 // NOTE: Careful to use OrigV rather than V.
@@ -3038,7 +3042,8 @@ class AllocaSliceRewriter : public 
InstVisitor {
 Store->copyMetadata(SI, {LLVMContext::MD_mem_parallel_loop_access,
  LLVMContext::MD_access_group});
 if (AATags)
-  Store->setAAMetadata(AATags.shift(NewBeginOffset - BeginOffset));
+  Store->setAAMetadata(AATags.adjustForAccess(NewBeginOffset - BeginOffset,
+  V->getType(), DL));
 
 migrateDebugInfo(&OldAI, IsSplit, NewBeginOffset * 8, SliceSize * 8, &SI,
  Store, Store->getPointerOperand(),
@@ -3097,8 +3102,10 @@ class AllocaSliceRewriter : public 
InstVisitor {
 }
 NewSI->copyMetadata(

[llvm-branch-commits] [llvm] [SROA] Use !tbaa instead of !tbaa.struct if op matches field. (PR #81289)

2024-02-15 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/81289

>From 90639e9131670863ebb4c199a9861b2b0094d601 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Fri, 9 Feb 2024 15:17:09 +
Subject: [PATCH 1/2] [SROA] Use !tbaa instead of !tbaa.struct if op matches
 field.

If a split memory access introduced by SROA accesses precisely a single
field of the original operation's !tbaa.struct, use the !tbaa tag for
the accessed field directly instead of the full !tbaa.struct.

InstCombine already had a similar logic.

Motivation for this and follow-on patches is to improve codegen for
libc++, where using memcpy limits optimizations, like vectorization for
code iteration over std::vector>:
https://godbolt.org/z/f3vqYos3c

Depends on https://github.com/llvm/llvm-project/pull/81285.
---
 llvm/include/llvm/IR/Metadata.h  |  2 +
 llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp | 13 ++
 llvm/lib/Transforms/Scalar/SROA.cpp  | 48 ++--
 llvm/test/Transforms/SROA/tbaa-struct2.ll| 21 -
 llvm/test/Transforms/SROA/tbaa-struct3.ll| 16 +++
 5 files changed, 67 insertions(+), 33 deletions(-)

diff --git a/llvm/include/llvm/IR/Metadata.h b/llvm/include/llvm/IR/Metadata.h
index 6f23ac44dee968..33363a271d4823 100644
--- a/llvm/include/llvm/IR/Metadata.h
+++ b/llvm/include/llvm/IR/Metadata.h
@@ -849,6 +849,8 @@ struct AAMDNodes {
   /// If his AAMDNode has !tbaa.struct and \p AccessSize matches the size of 
the
   /// field at offset 0, get the TBAA tag describing the accessed field.
   AAMDNodes adjustForAccess(unsigned AccessSize);
+  AAMDNodes adjustForAccess(size_t Offset, Type *AccessTy,
+const DataLayout &DL);
 };
 
 // Specialize DenseMapInfo for AAMDNodes.
diff --git a/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp 
b/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
index bfd70414c0340c..b2dc451d581939 100644
--- a/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
+++ b/llvm/lib/Analysis/TypeBasedAliasAnalysis.cpp
@@ -833,3 +833,16 @@ AAMDNodes AAMDNodes::adjustForAccess(unsigned AccessSize) {
   }
   return New;
 }
+
+AAMDNodes AAMDNodes::adjustForAccess(size_t Offset, Type *AccessTy,
+ const DataLayout &DL) {
+
+  AAMDNodes New = shift(Offset);
+  if (!DL.typeSizeEqualsStoreSize(AccessTy))
+return New;
+  TypeSize Size = DL.getTypeStoreSize(AccessTy);
+  if (Size.isScalable())
+return New;
+
+  return New.adjustForAccess(Size.getKnownMinValue());
+}
diff --git a/llvm/lib/Transforms/Scalar/SROA.cpp 
b/llvm/lib/Transforms/Scalar/SROA.cpp
index 138dc38b5c14ce..f24cbbc1fe0591 100644
--- a/llvm/lib/Transforms/Scalar/SROA.cpp
+++ b/llvm/lib/Transforms/Scalar/SROA.cpp
@@ -2914,7 +2914,8 @@ class AllocaSliceRewriter : public 
InstVisitor {
 
   // Do this after copyMetadataForLoad() to preserve the TBAA shift.
   if (AATags)
-NewLI->setAAMetadata(AATags.shift(NewBeginOffset - BeginOffset));
+NewLI->setAAMetadata(AATags.adjustForAccess(
+NewBeginOffset - BeginOffset, NewLI->getType(), DL));
 
   // Try to preserve nonnull metadata
   V = NewLI;
@@ -2936,7 +2937,9 @@ class AllocaSliceRewriter : public 
InstVisitor {
   IRB.CreateAlignedLoad(TargetTy, getNewAllocaSlicePtr(IRB, LTy),
 getSliceAlign(), LI.isVolatile(), 
LI.getName());
   if (AATags)
-NewLI->setAAMetadata(AATags.shift(NewBeginOffset - BeginOffset));
+NewLI->setAAMetadata(AATags.adjustForAccess(
+NewBeginOffset - BeginOffset, NewLI->getType(), DL));
+
   if (LI.isVolatile())
 NewLI->setAtomic(LI.getOrdering(), LI.getSyncScopeID());
   NewLI->copyMetadata(LI, {LLVMContext::MD_mem_parallel_loop_access,
@@ -3011,7 +3014,8 @@ class AllocaSliceRewriter : public 
InstVisitor {
 Store->copyMetadata(SI, {LLVMContext::MD_mem_parallel_loop_access,
  LLVMContext::MD_access_group});
 if (AATags)
-  Store->setAAMetadata(AATags.shift(NewBeginOffset - BeginOffset));
+  Store->setAAMetadata(AATags.adjustForAccess(NewBeginOffset - BeginOffset,
+  V->getType(), DL));
 Pass.DeadInsts.push_back(&SI);
 
 // NOTE: Careful to use OrigV rather than V.
@@ -3038,7 +3042,8 @@ class AllocaSliceRewriter : public 
InstVisitor {
 Store->copyMetadata(SI, {LLVMContext::MD_mem_parallel_loop_access,
  LLVMContext::MD_access_group});
 if (AATags)
-  Store->setAAMetadata(AATags.shift(NewBeginOffset - BeginOffset));
+  Store->setAAMetadata(AATags.adjustForAccess(NewBeginOffset - BeginOffset,
+  V->getType(), DL));
 
 migrateDebugInfo(&OldAI, IsSplit, NewBeginOffset * 8, SliceSize * 8, &SI,
  Store, Store->getPointerOperand(),
@@ -3097,8 +3102,10 @@ class AllocaSliceRewriter : public 
InstVisitor {
 }
 NewSI->copyMeta

[llvm-branch-commits] [llvm] [SROA] Use !tbaa instead of !tbaa.struct if op matches field. (PR #81289)

2024-02-15 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn edited https://github.com/llvm/llvm-project/pull/81289


[llvm-branch-commits] [llvm] [SROA] Use !tbaa instead of !tbaa.struct if op matches field. (PR #81289)

2024-02-15 Thread Florian Hahn via llvm-branch-commits


@@ -4561,6 +4577,10 @@ bool SROA::presplitLoadsAndStores(AllocaInst &AI, 
AllocaSlices &AS) {
 PStore->copyMetadata(*SI, {LLVMContext::MD_mem_parallel_loop_access,
LLVMContext::MD_access_group,
LLVMContext::MD_DIAssignID});
+
+if (AATags)
+  PStore->setAAMetadata(

fhahn wrote:

Should be now, I added a number of additional tests that should cover all cases 
here

https://github.com/llvm/llvm-project/pull/81289


[llvm-branch-commits] [llvm] [SROA] Use !tbaa instead of !tbaa.struct if op matches field. (PR #81289)

2024-02-15 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn commented:

@dobbelaj-snps Added a substantial number of tests that should cover all cases 
now in 2a9b86cc10c3883cca51a5166aad6e2b755fa958

https://github.com/llvm/llvm-project/pull/81289


[llvm-branch-commits] [llvm] [SLP] Initial vectorization of non-power-of-2 ops. (PR #77790)

2024-02-23 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn edited https://github.com/llvm/llvm-project/pull/77790


[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-26 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn closed https://github.com/llvm/llvm-project/pull/80273


[llvm-branch-commits] [llvm] [VPlan] Explicitly handle scalar pointer inductions. (PR #80273)

2024-02-26 Thread Florian Hahn via llvm-branch-commits

fhahn wrote:

@ayalz unfortunately I don't know how to update the target branch to 
`llvm:main`, so I went ahead and opened a new PR that's update on top of 
current `main`: https://github.com/llvm/llvm-project/pull/83068

Comments should be addressed, sorry for the inconvenience.

https://github.com/llvm/llvm-project/pull/80273


[llvm-branch-commits] [llvm] [SLP] Collect candidate VFs in vector in vectorizeStores (NFC). (PR #82793)

2024-02-28 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn edited https://github.com/llvm/llvm-project/pull/82793


[llvm-branch-commits] [llvm] [SLP] Initial vectorization of non-power-of-2 ops. (PR #77790)

2024-03-01 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn closed https://github.com/llvm/llvm-project/pull/77790


[llvm-branch-commits] [llvm] [SLP] Initial vectorization of non-power-of-2 ops. (PR #77790)

2024-03-01 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn reopened https://github.com/llvm/llvm-project/pull/77790


[llvm-branch-commits] [llvm] release/18.x: [DSE] Delay deleting non-memory-defs until end of DSE. (#83411) (PR #84227)

2024-03-06 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn approved this pull request.

LGTM, thanks!

https://github.com/llvm/llvm-project/pull/84227


[llvm-branch-commits] [llvm] release/18.x: [ARM] Update IsRestored for LR based on all returns (#82745) (PR #83129)

2024-03-20 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn approved this pull request.

LGTM, would be good to back-port.

https://github.com/llvm/llvm-project/pull/83129


[llvm-branch-commits] [clang] [CIR][Basic][NFC] Add the CIR language to the Language enum (PR #86072)

2024-03-21 Thread Florian Hahn via llvm-branch-commits

fhahn wrote:

Could you remove the commit-id line from the commit message, as it doesn’t seem 
relevant?

https://github.com/llvm/llvm-project/pull/86072


[llvm-branch-commits] [clang] [TySan] A Type Sanitizer (Clang) (PR #76260)

2024-04-18 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn edited https://github.com/llvm/llvm-project/pull/76260


[llvm-branch-commits] [clang] [flang] [libc] [libcxx] [llvm] [mlir] [openmp] [TySan] A Type Sanitizer (Clang) (PR #76260)

2024-04-18 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/76260

error: too big or took too long to generate


[llvm-branch-commits] [clang] [TySan] A Type Sanitizer (Clang) (PR #76260)

2024-04-18 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/76260

>From 96912aec51f6752d211d8bd091eaad6426037050 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Thu, 18 Apr 2024 23:01:03 +0100
Subject: [PATCH 1/2] [TySan] A Type Sanitizer (Clang)

---
 clang/include/clang/Basic/Features.def |  1 +
 clang/include/clang/Basic/Sanitizers.def   |  3 ++
 clang/include/clang/Driver/SanitizerArgs.h |  1 +
 clang/lib/CodeGen/BackendUtil.cpp  |  6 +++
 clang/lib/CodeGen/CGDecl.cpp   |  3 +-
 clang/lib/CodeGen/CGDeclCXX.cpp|  4 ++
 clang/lib/CodeGen/CodeGenFunction.cpp  |  2 +
 clang/lib/CodeGen/CodeGenModule.cpp| 12 +++---
 clang/lib/CodeGen/CodeGenTBAA.cpp  |  6 ++-
 clang/lib/CodeGen/SanitizerMetadata.cpp| 44 +-
 clang/lib/CodeGen/SanitizerMetadata.h  | 13 ---
 clang/lib/Driver/SanitizerArgs.cpp | 15 +---
 clang/lib/Driver/ToolChains/CommonArgs.cpp |  6 ++-
 clang/lib/Driver/ToolChains/Darwin.cpp |  5 +++
 clang/lib/Driver/ToolChains/Linux.cpp  |  2 +
 clang/test/Driver/sanitizer-ld.c   | 23 +++
 16 files changed, 116 insertions(+), 30 deletions(-)

diff --git a/clang/include/clang/Basic/Features.def 
b/clang/include/clang/Basic/Features.def
index fe4d1c4afcca65..589739eea2734d 100644
--- a/clang/include/clang/Basic/Features.def
+++ b/clang/include/clang/Basic/Features.def
@@ -99,6 +99,7 @@ FEATURE(nullability_nullable_result, true)
 FEATURE(memory_sanitizer,
 LangOpts.Sanitize.hasOneOf(SanitizerKind::Memory |
SanitizerKind::KernelMemory))
+FEATURE(type_sanitizer, LangOpts.Sanitize.has(SanitizerKind::Type))
 FEATURE(thread_sanitizer, LangOpts.Sanitize.has(SanitizerKind::Thread))
 FEATURE(dataflow_sanitizer, LangOpts.Sanitize.has(SanitizerKind::DataFlow))
 FEATURE(scudo, LangOpts.Sanitize.hasOneOf(SanitizerKind::Scudo))
diff --git a/clang/include/clang/Basic/Sanitizers.def 
b/clang/include/clang/Basic/Sanitizers.def
index b228ffd07ee745..a482cf520620bc 100644
--- a/clang/include/clang/Basic/Sanitizers.def
+++ b/clang/include/clang/Basic/Sanitizers.def
@@ -73,6 +73,9 @@ SANITIZER("fuzzer", Fuzzer)
 // libFuzzer-required instrumentation, no linking.
 SANITIZER("fuzzer-no-link", FuzzerNoLink)
 
+// TypeSanitizer
+SANITIZER("type", Type)
+
 // ThreadSanitizer
 SANITIZER("thread", Thread)
 
diff --git a/clang/include/clang/Driver/SanitizerArgs.h 
b/clang/include/clang/Driver/SanitizerArgs.h
index 07070ec4fc0653..52b482a0e8a1a9 100644
--- a/clang/include/clang/Driver/SanitizerArgs.h
+++ b/clang/include/clang/Driver/SanitizerArgs.h
@@ -86,6 +86,7 @@ class SanitizerArgs {
   bool needsHwasanAliasesRt() const {
 return needsHwasanRt() && HwasanUseAliases;
   }
+  bool needsTysanRt() const { return Sanitizers.has(SanitizerKind::Type); }
   bool needsTsanRt() const { return Sanitizers.has(SanitizerKind::Thread); }
   bool needsMsanRt() const { return Sanitizers.has(SanitizerKind::Memory); }
   bool needsFuzzer() const { return Sanitizers.has(SanitizerKind::Fuzzer); }
diff --git a/clang/lib/CodeGen/BackendUtil.cpp 
b/clang/lib/CodeGen/BackendUtil.cpp
index 6cc00b85664f41..1db5aca770b259 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -80,6 +80,7 @@
 #include "llvm/Transforms/Instrumentation/SanitizerBinaryMetadata.h"
 #include "llvm/Transforms/Instrumentation/SanitizerCoverage.h"
 #include "llvm/Transforms/Instrumentation/ThreadSanitizer.h"
+#include "llvm/Transforms/Instrumentation/TypeSanitizer.h"
 #include "llvm/Transforms/ObjCARC.h"
 #include "llvm/Transforms/Scalar/EarlyCSE.h"
 #include "llvm/Transforms/Scalar/GVN.h"
@@ -697,6 +698,11 @@ static void addSanitizers(const Triple &TargetTriple,
   MPM.addPass(createModuleToFunctionPassAdaptor(ThreadSanitizerPass()));
 }
 
+if (LangOpts.Sanitize.has(SanitizerKind::Type)) {
+  MPM.addPass(ModuleTypeSanitizerPass());
+  MPM.addPass(createModuleToFunctionPassAdaptor(TypeSanitizerPass()));
+}
+
 auto ASanPass = [&](SanitizerMask Mask, bool CompileKernel) {
   if (LangOpts.Sanitize.has(Mask)) {
 bool UseGlobalGC = asanUseGlobalsGC(TargetTriple, CodeGenOpts);
diff --git a/clang/lib/CodeGen/CGDecl.cpp b/clang/lib/CodeGen/CGDecl.cpp
index ce6d6d8956076e..42516fa749c830 100644
--- a/clang/lib/CodeGen/CGDecl.cpp
+++ b/clang/lib/CodeGen/CGDecl.cpp
@@ -482,7 +482,8 @@ void CodeGenFunction::EmitStaticVarDecl(const VarDecl &D,
   LocalDeclMap.find(&D)->second = Address(castedAddr, elemTy, alignment);
   CGM.setStaticLocalDeclAddress(&D, castedAddr);
 
-  CGM.getSanitizerMetadata()->reportGlobal(var, D);
+  CGM.getSanitizerMetadata()->reportGlobalToASan(var, D);
+  CGM.getSanitizerMetadata()->reportGlobalToTySan(var, D);
 
   // Emit global variable debug descriptor for static vars.
   CGDebugInfo *DI = getDebugInfo();
diff --git a/clang/lib/CodeGen/CGDeclCXX.cpp b/clang/lib/CodeGen/CGDeclCXX.cpp
index e08a1e5f42df20..08b


[llvm-branch-commits] [clang] [compiler-rt] [llvm] [TySan] A Type Sanitizer (Runtime Library) (PR #76261)

2024-04-19 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn edited https://github.com/llvm/llvm-project/pull/76261


[llvm-branch-commits] [clang] [compiler-rt] [llvm] [TySan] A Type Sanitizer (Runtime Library) (PR #76261)

2024-04-19 Thread Florian Hahn via llvm-branch-commits


@@ -720,7 +726,7 @@ if(COMPILER_RT_SUPPORTED_ARCH)
 endif()
 message(STATUS "Compiler-RT supported architectures: 
${COMPILER_RT_SUPPORTED_ARCH}")
 
-set(ALL_SANITIZERS 
asan;dfsan;msan;hwasan;tsan;safestack;cfi;scudo_standalone;ubsan_minimal;gwp_asan;asan_abi)
+set(ALL_SANITIZERS 
asan;dfsan;msan;hwasan;tsan;tysan,safestack;cfi;scudo_standalone;ubsan_minimal;gwp_asan;asan_abi)

fhahn wrote:

Thanks, updated!

https://github.com/llvm/llvm-project/pull/76261


[llvm-branch-commits] [llvm] [LAA] Support different strides & non constant dep distances using SCEV. (PR #88039)

2024-04-19 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn edited https://github.com/llvm/llvm-project/pull/88039


[llvm-branch-commits] [llvm] [LAA] Support different strides & non constant dep distances using SCEV. (PR #88039)

2024-04-19 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/88039

>From 110e5ea24d4b23a153b5f602460b81e5228c700f Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Thu, 4 Apr 2024 12:36:27 +0100
Subject: [PATCH 1/5] [LAA] Support different strides & non constant dep
 distances using SCEV.

Extend LoopAccessAnalysis to support different strides and, as a
consequence, non-constant distances between dependences, using SCEV to
reason about the direction of the dependence.

In multiple places, logic to rule out dependences using the stride has
been updated to only be used if StrideA == StrideB, i.e. there's a
common stride.

We may now also bail out at multiple places where we may have to set
FoundNonConstantDistanceDependence. This is done when we need to bail
out and the distance is not constant, to preserve the original behavior.
I'd like to call out the changes in global_alias.ll in particular.
In the modified mayAlias01, and mayAlias02 tests, there should be no
aliasing in the original versions of the test, as they are accessing 2
different 100 element fields in a loop with 100 iterations.

I moved the original tests to noAlias15 and noAlias16 respectively,
while updating the original tests to use a variable trip count.

In some cases, like 
different_non_constant_strides_known_backward_min_distance_3,
we now vectorize with runtime checks, even though the runtime checks
will always be false. I'll also share a follow-up patch, that also uses
SCEV to more accurately identify backwards dependences with non-constant
distances.

Fixes https://github.com/llvm/llvm-project/issues/87336
---
 llvm/lib/Analysis/LoopAccessAnalysis.cpp  | 115 --
 .../Transforms/Scalar/LoopLoadElimination.cpp |   4 +-
 .../non-constant-strides-backward.ll  |  90 +-
 .../non-constant-strides-forward.ll   |  10 +-
 .../Transforms/LoopVectorize/global_alias.ll  | 102 ++--
 .../single-iteration-loop-sroa.ll |  40 +-
 6 files changed, 274 insertions(+), 87 deletions(-)

diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp 
b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index c25eede96a1859..314484e11c4a7c 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -1923,8 +1923,9 @@ isLoopVariantIndirectAddress(ArrayRef 
UnderlyingObjects,
 // of various temporary variables, like A/BPtr, StrideA/BPtr and others.
 // Returns either the dependence result, if it could already be determined, or 
a
 // tuple with (Distance, Stride, TypeSize, AIsWrite, BIsWrite).
-static std::variant>
+static std::variant<
+MemoryDepChecker::Dependence::DepType,
+std::tuple>
 getDependenceDistanceStrideAndSize(
 const AccessAnalysis::MemAccessInfo &A, Instruction *AInst,
 const AccessAnalysis::MemAccessInfo &B, Instruction *BInst,
@@ -1982,7 +1983,7 @@ getDependenceDistanceStrideAndSize(
   // Need accesses with constant stride. We don't want to vectorize
   // "A[B[i]] += ..." and similar code or pointer arithmetic that could wrap
   // in the address space.
-  if (!StrideAPtr || !StrideBPtr || StrideAPtr != StrideBPtr) {
+  if (!StrideAPtr || !StrideBPtr) {
 LLVM_DEBUG(dbgs() << "Pointer access with non-constant stride\n");
 return MemoryDepChecker::Dependence::Unknown;
   }
@@ -1992,8 +1993,8 @@ getDependenceDistanceStrideAndSize(
   DL.getTypeStoreSizeInBits(ATy) == DL.getTypeStoreSizeInBits(BTy);
   if (!HasSameSize)
 TypeByteSize = 0;
-  uint64_t Stride = std::abs(StrideAPtr);
-  return std::make_tuple(Dist, Stride, TypeByteSize, AIsWrite, BIsWrite);
+  return std::make_tuple(Dist, std::abs(StrideAPtr), std::abs(StrideBPtr),
+ TypeByteSize, AIsWrite, BIsWrite);
 }
 
 MemoryDepChecker::Dependence::DepType MemoryDepChecker::isDependent(
@@ -2011,68 +2012,108 @@ MemoryDepChecker::Dependence::DepType 
MemoryDepChecker::isDependent(
   if (std::holds_alternative(Res))
 return std::get(Res);
 
-  const auto &[Dist, Stride, TypeByteSize, AIsWrite, BIsWrite] =
-  std::get>(Res);
+  const auto &[Dist, StrideA, StrideB, TypeByteSize, AIsWrite, BIsWrite] =
+  std::get<
+  std::tuple>(
+  Res);
   bool HasSameSize = TypeByteSize > 0;
 
+  uint64_t CommonStride = StrideA == StrideB ? StrideA : 0;
+  if (isa(Dist)) {
+  FoundNonConstantDistanceDependence = true;
+LLVM_DEBUG(dbgs() << "LAA: Dependence because of uncomputable 
distance.\n");
+return Dependence::Unknown;
+  }
+
   ScalarEvolution &SE = *PSE.getSE();
   auto &DL = InnermostLoop->getHeader()->getModule()->getDataLayout();
-  if (!isa(Dist) && HasSameSize &&
+  if (HasSameSize && CommonStride &&
   isSafeDependenceDistance(DL, SE, *(PSE.getBackedgeTakenCount()), *Dist,
-   Stride, TypeByteSize))
+   CommonStride, TypeByteSize))
 return Dependence::NoDep;
 
   const SCEVConstant *C = dyn_cast(Dist);
-  if (!C) {
-LLVM_DEBUG(d

[llvm-branch-commits] [llvm] [LAA] Support different strides & non constant dep distances using SCEV. (PR #88039)

2024-04-19 Thread Florian Hahn via llvm-branch-commits

fhahn wrote:

> > > I would enjoy more textual description of what every condition is meant 
> > > to check.
> > 
> > 
> > There are multiple places that hand off reasoning to called functions, 
> > would you like to have a summary of what the function checks there? Could 
> > do as separate patch, as this would be independent of the current patch?
> 
> I was not familiar with this code, trying to reduce the impact of this patch 
> doesn't help me to understand it and convince myself that it does not cause 
> miscompilation, it rather makes it even more difficult since conditions are 
> now all over the place.
> 

Thanks, I put up https://github.com/llvm/llvm-project/pull/89381 to add extra 
documentation and updated this PR to be based on #89381

https://github.com/llvm/llvm-project/pull/88039


[llvm-branch-commits] [llvm] [LAA] Support different strides & non constant dep distances using SCEV. (PR #88039)

2024-04-22 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn closed https://github.com/llvm/llvm-project/pull/88039


[llvm-branch-commits] [llvm] [LAA] Support different strides & non constant dep distances using SCEV. (PR #88039)

2024-04-22 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn reopened https://github.com/llvm/llvm-project/pull/88039


[llvm-branch-commits] [clang] [compiler-rt] [llvm] [TySan] A Type Sanitizer (Runtime Library) (PR #76261)

2024-04-22 Thread Florian Hahn via llvm-branch-commits

fhahn wrote:

Added compiler-rt tests for various strict-aliasing violations from the bug 
tracker I found.

https://github.com/llvm/llvm-project/pull/76261


[llvm-branch-commits] [llvm][NFC] Document cl::opt variable and fix typo (PR #90670)

2024-05-01 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn approved this pull request.

LGTM, thanks!

For the title, it might be clearer to explicitly mention the variable this is 
documenting, commit title size permitting

https://github.com/llvm/llvm-project/pull/90670


[llvm-branch-commits] [llvm] release/18.x: [FunctionAttrs] Fix incorrect nonnull inference for non-inbounds GEP (#91180) (PR #91286)

2024-05-07 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn approved this pull request.

Should be safe to back port, LGTM, thanks!

https://github.com/llvm/llvm-project/pull/91286


[llvm-branch-commits] [llvm] release/18.x: [LV, LAA] Don't vectorize loops with load and store to invar address. (PR #91092)

2024-05-13 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn approved this pull request.

LG to be back-ported if desired. It fixes a miscompile, but I am not aware of
any end-to-end reports. I am not sure what the criteria for cherry-picks on the
release branch are at this point in the release.

https://github.com/llvm/llvm-project/pull/91092


[llvm-branch-commits] [llvm] [LAA] Use SCEVUse to add extra NUW flags to pointer bounds. (WIP) (PR #91962)

2024-05-13 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn created https://github.com/llvm/llvm-project/pull/91962

Use SCEVUse to add a NUW flag to the upper bound of an accessed pointer.
We must already have proved that the pointers do not wrap, as otherwise
we could not use them for runtime check computations.

By adding the use-specific NUW flag, we can detect cases where SCEV can
prove that the compared pointers must overlap, so the runtime checks
will always be false. In that case, there is no point in vectorizing
with runtime checks.

Note that this depends on c2895cd27fbf200d1da056bc66d77eeb62690bf0, which
could be submitted separately if desired; without the current change, I
don't think it triggers in practice though.

Depends on https://github.com/llvm/llvm-project/pull/91961

>From 448c6db95cf89b8f6d007f7049afd02ca21d4427 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Wed, 1 May 2024 11:03:42 +0100
Subject: [PATCH 1/3] [SCEV,LAA] Add tests to make sure scoped SCEVs don't
 impact other SCEVs.

---
 .../LoopAccessAnalysis/scoped-scevs.ll| 182 ++
 1 file changed, 182 insertions(+)
 create mode 100644 llvm/test/Analysis/LoopAccessAnalysis/scoped-scevs.ll

diff --git a/llvm/test/Analysis/LoopAccessAnalysis/scoped-scevs.ll 
b/llvm/test/Analysis/LoopAccessAnalysis/scoped-scevs.ll
new file mode 100644
index 0..323ba2a739cf8
--- /dev/null
+++ b/llvm/test/Analysis/LoopAccessAnalysis/scoped-scevs.ll
@@ -0,0 +1,182 @@
+; NOTE: Assertions have been autogenerated by 
utils/update_analyze_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -passes='print,print' 
-disable-output %s 2>&1 | FileCheck --check-prefixes=LAA,AFTER %s
+; RUN: opt 
-passes='print,print,print' 
-disable-output %s 2>&1 | FileCheck --check-prefixes=BEFORE,LAA,AFTER %s
+
+target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
+
+declare void @use(ptr)
+
+; Check that scoped expressions created by LAA do not interfere with non-scoped
+; SCEVs with the same operands. The tests first run print to
+; populate the SCEV cache. They contain a GEP computing A+405, which is the end
+; of the accessed range, before and/or after the loop. No nuw flags should be
+; added to them in the second print output.
+
+define ptr @test_ptr_range_end_computed_before_and_after_loop(ptr %A) {
+; BEFORE-LABEL: 'test_ptr_range_end_computed_before_and_after_loop'
+; BEFORE-NEXT:  Classifying expressions for: 
@test_ptr_range_end_computed_before_and_after_loop
+; BEFORE:%x = getelementptr inbounds i8, ptr %A, i64 405
+; BEFORE-NEXT:--> (405 + %A) U: full-set S: full-set
+; BEFORE:%y = getelementptr inbounds i8, ptr %A, i64 405
+; BEFORE-NEXT:--> (405 + %A) U: full-set S: full-set
+;
+; LAA-LABEL: 'test_ptr_range_end_computed_before_and_after_loop'
+; LAA-NEXT:loop:
+; LAA-NEXT:  Memory dependences are safe with run-time checks
+; LAA-NEXT:  Dependences:
+; LAA-NEXT:  Run-time memory checks:
+; LAA-NEXT:  Check 0:
+; LAA-NEXT:Comparing group ([[GRP1:0x[0-9a-f]+]]):
+; LAA-NEXT:  %gep.A.400 = getelementptr inbounds i32, ptr %A.1, i64 %iv
+; LAA-NEXT:Against group ([[GRP2:0x[0-9a-f]+]]):
+; LAA-NEXT:  %gep.A = getelementptr inbounds i8, ptr %A, i64 %iv
+; LAA-NEXT:  Grouped accesses:
+; LAA-NEXT:Group [[GRP1]]:
+; LAA-NEXT:  (Low: (1 + %A) High: (405 + %A))
+; LAA-NEXT:Member: {(1 + %A),+,4}<%loop>
+; LAA-NEXT:Group [[GRP2]]:
+; LAA-NEXT:  (Low: %A High: (101 + %A))
+; LAA-NEXT:Member: {%A,+,1}<%loop>
+; LAA-EMPTY:
+; LAA-NEXT:  Non vectorizable stores to invariant address were not found 
in loop.
+; LAA-NEXT:  SCEV assumptions:
+; LAA-EMPTY:
+; LAA-NEXT:  Expressions re-written:
+;
+; AFTER-LABEL: 'test_ptr_range_end_computed_before_and_after_loop'
+; AFTER-NEXT:  Classifying expressions for: 
@test_ptr_range_end_computed_before_and_after_loop
+; AFTER:%x = getelementptr inbounds i8, ptr %A, i64 405
+; AFTER-NEXT:--> (405 + %A) U: full-set S: full-set
+; AFTER:%y = getelementptr inbounds i8, ptr %A, i64 405
+; AFTER-NEXT:--> (405 + %A) U: full-set S: full-set
+entry:
+  %A.1 = getelementptr inbounds i8, ptr %A, i64 1
+  %x = getelementptr inbounds i8, ptr %A, i64 405
+  call void @use(ptr %x)
+  br label %loop
+
+loop:
+  %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
+  %gep.A.400 = getelementptr inbounds i32, ptr %A.1, i64 %iv
+  %gep.A = getelementptr inbounds i8, ptr %A, i64 %iv
+  %l = load i8, ptr %gep.A, align 1
+  %ext = zext i8 %l to i32
+  store i32 %ext, ptr %gep.A.400, align 4
+  %iv.next = add nuw nsw i64 %iv, 1
+  %ec = icmp eq i64 %iv, 100
+  br i1 %ec, label %exit, label %loop
+
+exit:
+  %y = getelementptr inbounds i8, ptr %A, i64 405
+  ret ptr %y
+}
+
+define void @test_ptr_range_end_computed_before_loop(ptr %A) {
+; BEFORE-LABEL: 'test_ptr_range_end_computed_before_loop'
+; BEFORE-NEXT:  Classifying expressions for: 
@test_ptr_range_end_computed_before_loop
+; BEFORE-NEXT:

[llvm-branch-commits] [llvm] [SCEV] Add option to request use-specific SCEV for a GEP expr (WIP). (PR #91964)

2024-05-13 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn created https://github.com/llvm/llvm-project/pull/91964

Use SCEVUse from https://github.com/llvm/llvm-project/pull/91961 to return a 
SCEVUse with use-specific no-wrap flags for GEP expr, when demanded.

Clients need to opt-in, as the use-specific flags may not be valid in some 
contexts (e.g. backedge taken counts).

>From dea2c74ec83f390025cb0389859e472e8676c768 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Sun, 12 May 2024 09:57:54 +0100
Subject: [PATCH] [SCEV] Add option to request use-specific SCEV for a GEP expr
 (WIP).

Use SCEVUse from https://github.com/llvm/llvm-project/pull/91961 to
return a SCEVUse with use-specific no-wrap flags for GEP expr, when
demanded.

Clients need to opt-in, as the use-specific flags may not be valid in
some contexts (e.g. backedge taken counts).
---
 llvm/include/llvm/Analysis/ScalarEvolution.h  | 14 
 llvm/lib/Analysis/ScalarEvolution.cpp | 36 +++
 .../Analysis/ScalarEvolution/min-max-exprs.ll |  2 +-
 .../ScalarEvolution/no-wrap-add-exprs.ll  | 12 +++
 .../no-wrap-symbolic-becount.ll   |  2 +-
 .../test/Analysis/ScalarEvolution/ptrtoint.ll |  4 +--
 llvm/test/Analysis/ScalarEvolution/sdiv.ll|  2 +-
 llvm/test/Analysis/ScalarEvolution/srem.ll|  2 +-
 8 files changed, 41 insertions(+), 33 deletions(-)

diff --git a/llvm/include/llvm/Analysis/ScalarEvolution.h 
b/llvm/include/llvm/Analysis/ScalarEvolution.h
index 2859df9964555..4ca3dbc1c6703 100644
--- a/llvm/include/llvm/Analysis/ScalarEvolution.h
+++ b/llvm/include/llvm/Analysis/ScalarEvolution.h
@@ -653,7 +653,7 @@ class ScalarEvolution {
 
   /// Return a SCEV expression for the full generality of the specified
   /// expression.
-  SCEVUse getSCEV(Value *V);
+  SCEVUse getSCEV(Value *V, bool UseCtx = false);
 
   /// Return an existing SCEV for V if there is one, otherwise return nullptr.
   SCEVUse getExistingSCEV(Value *V);
@@ -735,9 +735,11 @@ class ScalarEvolution {
   /// \p GEP The GEP. The indices contained in the GEP itself are ignored,
   /// instead we use IndexExprs.
   /// \p IndexExprs The expressions for the indices.
-  SCEVUse getGEPExpr(GEPOperator *GEP, ArrayRef IndexExprs);
+  SCEVUse getGEPExpr(GEPOperator *GEP, ArrayRef IndexExprs,
+ bool UseCtx = false);
   SCEVUse getGEPExpr(GEPOperator *GEP,
- const SmallVectorImpl &IndexExprs);
+ const SmallVectorImpl &IndexExprs,
+ bool UseCtx = false);
   SCEVUse getAbsExpr(SCEVUse Op, bool IsNSW);
   SCEVUse getMinMaxExpr(SCEVTypes Kind, ArrayRef Operands);
   SCEVUse getMinMaxExpr(SCEVTypes Kind, SmallVectorImpl &Operands);
@@ -1783,11 +1785,11 @@ class ScalarEvolution {
 
   /// We know that there is no SCEV for the specified value.  Analyze the
   /// expression recursively.
-  SCEVUse createSCEV(Value *V);
+  SCEVUse createSCEV(Value *V, bool UseCtx = false);
 
   /// We know that there is no SCEV for the specified value. Create a new SCEV
   /// for \p V iteratively.
-  SCEVUse createSCEVIter(Value *V);
+  SCEVUse createSCEVIter(Value *V, bool UseCtx = false);
   /// Collect operands of \p V for which SCEV expressions should be constructed
   /// first. Returns a SCEV directly if it can be constructed trivially for \p
   /// V.
@@ -1826,7 +1828,7 @@ class ScalarEvolution {
Value *FalseVal);
 
   /// Provide the special handling we need to analyze GEP SCEVs.
-  SCEVUse createNodeForGEP(GEPOperator *GEP);
+  SCEVUse createNodeForGEP(GEPOperator *GEP, bool UseCtx = false);
 
   /// Implementation code for getSCEVAtScope; called at most once for each
   /// SCEV+Loop pair.
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp 
b/llvm/lib/Analysis/ScalarEvolution.cpp
index 320be6f26fc0a..68f4ddd47d69a 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -2741,6 +2741,8 @@ SCEVUse 
ScalarEvolution::getAddExpr(SmallVectorImpl &Ops,
 break;
   // If we have an add, expand the add operands onto the end of the 
operands
   // list.
+  // CommonFlags = maskFlags(CommonFlags, setFlags(Add->getNoWrapFlags(),
+  // static_cast(Ops[Idx].getInt(;
   Ops.erase(Ops.begin()+Idx);
   append_range(Ops, Add->operands());
   DeletedAdd = true;
@@ -3759,13 +3761,14 @@ SCEVUse 
ScalarEvolution::getAddRecExpr(SmallVectorImpl &Operands,
 }
 
 SCEVUse ScalarEvolution::getGEPExpr(GEPOperator *GEP,
-ArrayRef IndexExprs) {
-  return getGEPExpr(GEP, SmallVector(IndexExprs));
+ArrayRef IndexExprs,
+bool UseCtx) {
+  return getGEPExpr(GEP, SmallVector(IndexExprs), UseCtx);
 }
 
-SCEVUse
-ScalarEvolution::getGEPExpr(GEPOperator *GEP,
-const SmallVectorImpl &IndexExprs) {
+SCEVUse ScalarEvolution::getGEPExpr(GEPOperator *GEP,
+const

[llvm-branch-commits] [llvm] release/18.x: [LV, LAA] Don't vectorize loops with load and store to invar address. (PR #91092)

2024-05-15 Thread Florian Hahn via llvm-branch-commits

fhahn wrote:

SGTM

https://github.com/llvm/llvm-project/pull/91092
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LAA] Use SCEVUse to add extra NUW flags to pointer bounds. (WIP) (PR #91962)

2024-05-22 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/91962

>From ab0311667695fb255625cc846e02373800fad8b1 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Wed, 1 May 2024 11:03:42 +0100
Subject: [PATCH 1/3] [SCEV,LAA] Add tests to make sure scoped SCEVs don't
 impact other SCEVs.

---
 .../LoopAccessAnalysis/scoped-scevs.ll| 182 ++
 1 file changed, 182 insertions(+)
 create mode 100644 llvm/test/Analysis/LoopAccessAnalysis/scoped-scevs.ll

diff --git a/llvm/test/Analysis/LoopAccessAnalysis/scoped-scevs.ll 
b/llvm/test/Analysis/LoopAccessAnalysis/scoped-scevs.ll
new file mode 100644
index 0..323ba2a739cf8
--- /dev/null
+++ b/llvm/test/Analysis/LoopAccessAnalysis/scoped-scevs.ll
@@ -0,0 +1,182 @@
+; NOTE: Assertions have been autogenerated by 
utils/update_analyze_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -passes='print,print' 
-disable-output %s 2>&1 | FileCheck --check-prefixes=LAA,AFTER %s
+; RUN: opt 
-passes='print,print,print' 
-disable-output %s 2>&1 | FileCheck --check-prefixes=BEFORE,LAA,AFTER %s
+
+target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
+
+declare void @use(ptr)
+
+; Check that scoped expressions created by LAA do not interfere with non-scoped
+; SCEVs with the same operands. The tests first run print to
+; populate the SCEV cache. They contain a GEP computing A+405, which is the end
+; of the accessed range, before and/or after the loop. No nuw flags should be
+; added to them in the second print output.
+
+define ptr @test_ptr_range_end_computed_before_and_after_loop(ptr %A) {
+; BEFORE-LABEL: 'test_ptr_range_end_computed_before_and_after_loop'
+; BEFORE-NEXT:  Classifying expressions for: 
@test_ptr_range_end_computed_before_and_after_loop
+; BEFORE:%x = getelementptr inbounds i8, ptr %A, i64 405
+; BEFORE-NEXT:--> (405 + %A) U: full-set S: full-set
+; BEFORE:%y = getelementptr inbounds i8, ptr %A, i64 405
+; BEFORE-NEXT:--> (405 + %A) U: full-set S: full-set
+;
+; LAA-LABEL: 'test_ptr_range_end_computed_before_and_after_loop'
+; LAA-NEXT:loop:
+; LAA-NEXT:  Memory dependences are safe with run-time checks
+; LAA-NEXT:  Dependences:
+; LAA-NEXT:  Run-time memory checks:
+; LAA-NEXT:  Check 0:
+; LAA-NEXT:Comparing group ([[GRP1:0x[0-9a-f]+]]):
+; LAA-NEXT:  %gep.A.400 = getelementptr inbounds i32, ptr %A.1, i64 %iv
+; LAA-NEXT:Against group ([[GRP2:0x[0-9a-f]+]]):
+; LAA-NEXT:  %gep.A = getelementptr inbounds i8, ptr %A, i64 %iv
+; LAA-NEXT:  Grouped accesses:
+; LAA-NEXT:Group [[GRP1]]:
+; LAA-NEXT:  (Low: (1 + %A) High: (405 + %A))
+; LAA-NEXT:Member: {(1 + %A),+,4}<%loop>
+; LAA-NEXT:Group [[GRP2]]:
+; LAA-NEXT:  (Low: %A High: (101 + %A))
+; LAA-NEXT:Member: {%A,+,1}<%loop>
+; LAA-EMPTY:
+; LAA-NEXT:  Non vectorizable stores to invariant address were not found 
in loop.
+; LAA-NEXT:  SCEV assumptions:
+; LAA-EMPTY:
+; LAA-NEXT:  Expressions re-written:
+;
+; AFTER-LABEL: 'test_ptr_range_end_computed_before_and_after_loop'
+; AFTER-NEXT:  Classifying expressions for: 
@test_ptr_range_end_computed_before_and_after_loop
+; AFTER:%x = getelementptr inbounds i8, ptr %A, i64 405
+; AFTER-NEXT:--> (405 + %A) U: full-set S: full-set
+; AFTER:%y = getelementptr inbounds i8, ptr %A, i64 405
+; AFTER-NEXT:--> (405 + %A) U: full-set S: full-set
+entry:
+  %A.1 = getelementptr inbounds i8, ptr %A, i64 1
+  %x = getelementptr inbounds i8, ptr %A, i64 405
+  call void @use(ptr %x)
+  br label %loop
+
+loop:
+  %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
+  %gep.A.400 = getelementptr inbounds i32, ptr %A.1, i64 %iv
+  %gep.A = getelementptr inbounds i8, ptr %A, i64 %iv
+  %l = load i8, ptr %gep.A, align 1
+  %ext = zext i8 %l to i32
+  store i32 %ext, ptr %gep.A.400, align 4
+  %iv.next = add nuw nsw i64 %iv, 1
+  %ec = icmp eq i64 %iv, 100
+  br i1 %ec, label %exit, label %loop
+
+exit:
+  %y = getelementptr inbounds i8, ptr %A, i64 405
+  ret ptr %y
+}
+
+define void @test_ptr_range_end_computed_before_loop(ptr %A) {
+; BEFORE-LABEL: 'test_ptr_range_end_computed_before_loop'
+; BEFORE-NEXT:  Classifying expressions for: 
@test_ptr_range_end_computed_before_loop
+; BEFORE-NEXT:%A.1 = getelementptr inbounds i8, ptr %A, i64 1
+; BEFORE-NEXT:--> (1 + %A) U: full-set S: full-set
+; BEFORE-NEXT:%x = getelementptr inbounds i8, ptr %A, i64 405
+;
+; LAA-LABEL: 'test_ptr_range_end_computed_before_loop'
+; LAA-NEXT:loop:
+; LAA-NEXT:  Memory dependences are safe with run-time checks
+; LAA-NEXT:  Dependences:
+; LAA-NEXT:  Run-time memory checks:
+; LAA-NEXT:  Check 0:
+; LAA-NEXT:Comparing group ([[GRP3:0x[0-9a-f]+]]):
+; LAA-NEXT:  %gep.A.400 = getelementptr inbounds i32, ptr %A.1, i64 %iv
+; LAA-NEXT:Against group ([[GRP4:0x[0-9a-f]+]]):
+; LAA-NEXT:  %gep.A = getelementptr inbounds i8, ptr 

[llvm-branch-commits] [llvm] [LAA] Use getBackedgeTakenCountForCountableExits. (PR #93499)

2024-05-27 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn created https://github.com/llvm/llvm-project/pull/93499

Update LAA to use getBackedgeTakenCountForCountableExits, which returns
the minimum of the countable exit counts.

When analyzing dependences and computing runtime checks, we need the
smallest upper bound on the number of iterations. In terms of memory
safety, it shouldn't matter if the loop is exited early through an
uncomputable exit, as long as we prove that there are no dependences
given the minimum of the countable exit counts. The same applies to
generating runtime checks.

Note that this shifts the responsibility of checking whether all exit
counts are computable, or of handling early exits, to the users of LAA.

Depends on https://github.com/llvm/llvm-project/pull/93498

>From 80decf5050269fa0e91bf0b397ac9a7565cd6d72 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Wed, 8 May 2024 20:47:29 +0100
Subject: [PATCH] [LAA] Use getBackedgeTakenCountForCountableExits.

Update LAA to use getBackedgeTakenCountForCountableExits, which returns
the minimum of the countable exit counts.

When analyzing dependences and computing runtime checks, we need the
smallest upper bound on the number of iterations. In terms of memory
safety, it shouldn't matter if the loop is exited early through an
uncomputable exit, as long as we prove that there are no dependences
given the minimum of the countable exit counts. The same applies to
generating runtime checks.

Note that this shifts the responsibility of checking whether all exit
counts are computable, or of handling early exits, to the users of LAA.
---
 llvm/lib/Analysis/LoopAccessAnalysis.cpp  | 11 ++-
 .../Vectorize/LoopVectorizationLegality.cpp   | 10 ++
 .../early-exit-runtime-checks.ll  | 26 -
 .../Transforms/LoopDistribute/early-exit.ll   | 96 +++
 .../Transforms/LoopLoadElim/early-exit.ll | 61 
 5 files changed, 197 insertions(+), 7 deletions(-)
 create mode 100644 llvm/test/Transforms/LoopDistribute/early-exit.ll
 create mode 100644 llvm/test/Transforms/LoopLoadElim/early-exit.ll

diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp 
b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index bc8b9b8479e4f..f15dcaf94ee11 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -214,7 +214,7 @@ getStartAndEndForAccess(const Loop *Lp, const SCEV 
*PtrExpr, Type *AccessTy,
   if (SE->isLoopInvariant(PtrExpr, Lp)) {
 ScStart = ScEnd = PtrExpr;
   } else if (auto *AR = dyn_cast(PtrExpr)) {
-const SCEV *Ex = PSE.getBackedgeTakenCount();
+const SCEV *Ex = PSE.getBackedgeTakenCountForCountableExits();
 
 ScStart = AR->getStart();
 ScEnd = AR->evaluateAtIteration(Ex, *SE);
@@ -2056,8 +2056,9 @@ MemoryDepChecker::Dependence::DepType 
MemoryDepChecker::isDependent(
   // i.e. they are far enough appart that accesses won't access the same
   // location across all loop ierations.
   if (HasSameSize &&
-  isSafeDependenceDistance(DL, SE, *(PSE.getBackedgeTakenCount()), *Dist,
-   MaxStride, TypeByteSize))
+  isSafeDependenceDistance(DL, SE,
+   *(PSE.getBackedgeTakenCountForCountableExits()),
+   *Dist, MaxStride, TypeByteSize))
 return Dependence::NoDep;
 
   const SCEVConstant *C = dyn_cast(Dist);
@@ -2395,7 +2396,7 @@ bool LoopAccessInfo::canAnalyzeLoop() {
   }
 
   // ScalarEvolution needs to be able to find the exit count.
-  const SCEV *ExitCount = PSE->getBackedgeTakenCount();
+  const SCEV *ExitCount = PSE->getBackedgeTakenCountForCountableExits();
   if (isa(ExitCount)) {
 recordAnalysis("CantComputeNumberOfIterations")
 << "could not determine number of loop iterations";
@@ -3004,7 +3005,7 @@ void LoopAccessInfo::collectStridedAccess(Value 
*MemAccess) {
   // of various possible stride specializations, considering the alternatives
   // of using gather/scatters (if available).
 
-  const SCEV *BETakenCount = PSE->getBackedgeTakenCount();
+  const SCEV *BETakenCount = PSE->getBackedgeTakenCountForCountableExits();
 
   // Match the types so we can compare the stride and the BETakenCount.
   // The Stride can be positive/negative, so we sign extend Stride;
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp 
b/llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
index 9de49d1bcfeac..0c18c4e146de1 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
@@ -1506,6 +1506,16 @@ bool LoopVectorizationLegality::canVectorize(bool 
UseVPlanNativePath) {
   return false;
   }
 
+  if (isa(PSE.getBackedgeTakenCount())) {
+reportVectorizationFailure("could not determine number of loop iterations",
+   "could not determine number of loop iterations",
+   "CantComputeNumberOfIterations", ORE, TheLoop);
+if (DoExtraAnalysis)
+  Result = false;
+els

[llvm-branch-commits] [llvm] [LAA] Use getBackedgeTakenCountForCountableExits. (PR #93499)

2024-05-27 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/93499

>From 1ce660b45d3706912705bc9e7a8c19e86f05d0c0 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Wed, 8 May 2024 20:47:29 +0100
Subject: [PATCH] [LAA] Use getBackedgeTakenCountForCountableExits.

Update LAA to use getBackedgeTakenCountForCountableExits, which returns
the minimum of the countable exit counts.

When analyzing dependences and computing runtime checks, we need the
smallest upper bound on the number of iterations. In terms of memory
safety, it shouldn't matter if the loop is exited early through an
uncomputable exit, as long as we prove that there are no dependences
given the minimum of the countable exit counts. The same applies to
generating runtime checks.

Note that this shifts the responsibility of checking whether all exit
counts are computable, or of handling early exits, to the users of LAA.
---
 llvm/lib/Analysis/LoopAccessAnalysis.cpp  | 12 +--
 .../Vectorize/LoopVectorizationLegality.cpp   | 10 ++
 .../early-exit-runtime-checks.ll  | 39 +++-
 .../memcheck-wrapping-pointers.ll | 14 +--
 .../Transforms/LoopDistribute/early-exit.ll   | 96 +++
 .../Transforms/LoopLoadElim/early-exit.ll | 61 
 6 files changed, 216 insertions(+), 16 deletions(-)
 create mode 100644 llvm/test/Transforms/LoopDistribute/early-exit.ll
 create mode 100644 llvm/test/Transforms/LoopLoadElim/early-exit.ll

diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp 
b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index bc8b9b8479e4f..58096b66f704b 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -214,7 +214,7 @@ getStartAndEndForAccess(const Loop *Lp, const SCEV 
*PtrExpr, Type *AccessTy,
   if (SE->isLoopInvariant(PtrExpr, Lp)) {
 ScStart = ScEnd = PtrExpr;
   } else if (auto *AR = dyn_cast(PtrExpr)) {
-const SCEV *Ex = PSE.getBackedgeTakenCount();
+const SCEV *Ex = PSE.getSymbolicMaxBackedgeTakenCount();
 
 ScStart = AR->getStart();
 ScEnd = AR->evaluateAtIteration(Ex, *SE);
@@ -2055,9 +2055,9 @@ MemoryDepChecker::Dependence::DepType 
MemoryDepChecker::isDependent(
   // stride multiplied by the backedge taken count, the accesses are 
independet,
   // i.e. they are far enough appart that accesses won't access the same
   // location across all loop ierations.
-  if (HasSameSize &&
-  isSafeDependenceDistance(DL, SE, *(PSE.getBackedgeTakenCount()), *Dist,
-   MaxStride, TypeByteSize))
+  if (HasSameSize && isSafeDependenceDistance(
+ DL, SE, *(PSE.getSymbolicMaxBackedgeTakenCount()),
+ *Dist, MaxStride, TypeByteSize))
 return Dependence::NoDep;
 
   const SCEVConstant *C = dyn_cast(Dist);
@@ -2395,7 +2395,7 @@ bool LoopAccessInfo::canAnalyzeLoop() {
   }
 
   // ScalarEvolution needs to be able to find the exit count.
-  const SCEV *ExitCount = PSE->getBackedgeTakenCount();
+  const SCEV *ExitCount = PSE->getSymbolicMaxBackedgeTakenCount();
   if (isa(ExitCount)) {
 recordAnalysis("CantComputeNumberOfIterations")
 << "could not determine number of loop iterations";
@@ -3004,7 +3004,7 @@ void LoopAccessInfo::collectStridedAccess(Value 
*MemAccess) {
   // of various possible stride specializations, considering the alternatives
   // of using gather/scatters (if available).
 
-  const SCEV *BETakenCount = PSE->getBackedgeTakenCount();
+  const SCEV *BETakenCount = PSE->getSymbolicMaxBackedgeTakenCount();
 
   // Match the types so we can compare the stride and the BETakenCount.
   // The Stride can be positive/negative, so we sign extend Stride;
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp 
b/llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
index 9de49d1bcfeac..0c18c4e146de1 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
@@ -1506,6 +1506,16 @@ bool LoopVectorizationLegality::canVectorize(bool 
UseVPlanNativePath) {
   return false;
   }
 
+  if (isa(PSE.getBackedgeTakenCount())) {
+reportVectorizationFailure("could not determine number of loop iterations",
+   "could not determine number of loop iterations",
+   "CantComputeNumberOfIterations", ORE, TheLoop);
+if (DoExtraAnalysis)
+  Result = false;
+else
+  return false;
+  }
+
   LLVM_DEBUG(dbgs() << "LV: We can vectorize this loop"
 << (LAI->getRuntimePointerChecking()->Need
 ? " (with a runtime bound check)"
diff --git a/llvm/test/Analysis/LoopAccessAnalysis/early-exit-runtime-checks.ll 
b/llvm/test/Analysis/LoopAccessAnalysis/early-exit-runtime-checks.ll
index 0d85f11f06dce..a40aaa8ae99a0 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/early-exit-runtime-checks.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/early-exit-runtim

[llvm-branch-commits] [llvm] [LAA] Use PSE::getSymbolicMaxBackedgeTakenCount. (PR #93499)

2024-05-27 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn edited https://github.com/llvm/llvm-project/pull/93499


[llvm-branch-commits] [llvm] [LAA] Use PSE::getSymbolicMaxBackedgeTakenCount. (PR #93499)

2024-05-27 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn edited https://github.com/llvm/llvm-project/pull/93499


[llvm-branch-commits] [llvm] fd82b5b - [LV] Support recieps without underlying instr in collectPoisonGenRec.

2023-11-03 Thread Florian Hahn via llvm-branch-commits

Author: Florian Hahn
Date: 2023-11-03T10:21:14Z
New Revision: fd82b5b2876b3885b0590ba4538c316fa0e33cf7

URL: 
https://github.com/llvm/llvm-project/commit/fd82b5b2876b3885b0590ba4538c316fa0e33cf7
DIFF: 
https://github.com/llvm/llvm-project/commit/fd82b5b2876b3885b0590ba4538c316fa0e33cf7.diff

LOG: [LV] Support recieps without underlying instr in collectPoisonGenRec.

Support recipes without underlying instruction in
collectPoisonGeneratingRecipes by directly trying to dyn_cast_or_null
the underlying value.

Fixes https://github.com/llvm/llvm-project/issues/70590.

Added: 


Modified: 
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
llvm/test/Transforms/LoopVectorize/X86/drop-poison-generating-flags.ll

Removed: 




diff  --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp 
b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 4f547886f602534..1c208f72af678f7 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -1103,7 +1103,8 @@ void InnerLoopVectorizer::collectPoisonGeneratingRecipes(
   if (auto *RecWithFlags = dyn_cast(CurRec)) {
 RecWithFlags->dropPoisonGeneratingFlags();
   } else {
-Instruction *Instr = CurRec->getUnderlyingInstr();
+Instruction *Instr = dyn_cast_or_null(
+CurRec->getVPSingleValue()->getUnderlyingValue());
 (void)Instr;
 assert((!Instr || !Instr->hasPoisonGeneratingFlags()) &&
"found instruction with poison generating flags not covered by "

diff  --git 
a/llvm/test/Transforms/LoopVectorize/X86/drop-poison-generating-flags.ll 
b/llvm/test/Transforms/LoopVectorize/X86/drop-poison-generating-flags.ll
index b440da6dd866081..5694367dd1f9016 100644
--- a/llvm/test/Transforms/LoopVectorize/X86/drop-poison-generating-flags.ll
+++ b/llvm/test/Transforms/LoopVectorize/X86/drop-poison-generating-flags.ll
@@ -405,6 +405,89 @@ loop.exit:
   ret void
 }
 
+@c = external global [5 x i8]
+
+; Test case for https://github.com/llvm/llvm-project/issues/70590.
+; Note that the then block has UB, but I could not find any other way to
+; construct a suitable test case.
+define void @pr70590_recipe_without_underlying_instr(i64 %n, ptr noalias %dst) 
{
+; CHECK-LABEL: @pr70590_recipe_without_underlying_instr(
+; CHECK:   vector.body:
+; CHECK-NEXT:[[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH:%.+]] ], [ 
[[INDEX_NEXT:%.*]], [[PRED_SREM_CONTINUE6:%.*]] ]
+; CHECK-NEXT:[[VEC_IND:%.*]] = phi <4 x i64> [ , [[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], [[PRED_SREM_CONTINUE6]] ]
+; CHECK-NEXT:[[TMP0:%.*]] = add i64 [[INDEX]], 0
+; CHECK-NEXT:[[TMP1:%.*]] = icmp eq <4 x i64> [[VEC_IND]],
+; CHECK-NEXT:[[TMP2:%.*]] = xor <4 x i1> [[TMP1]], 
+; CHECK-NEXT:[[TMP3:%.*]] = extractelement <4 x i1> [[TMP2]], i32 0
+; CHECK-NEXT:br i1 [[TMP3]], label [[PRED_SREM_IF:%.*]], label 
[[PRED_SREM_CONTINUE:%.*]]
+; CHECK:   pred.srem.if:
+; CHECK-NEXT:[[TMP4:%.*]] = srem i64 3, 0
+; CHECK-NEXT:br label [[PRED_SREM_CONTINUE]]
+; CHECK:   pred.srem.continue:
+; CHECK-NEXT:[[TMP5:%.*]] = phi i64 [ poison, %vector.body ], [ [[TMP4]], 
[[PRED_SREM_IF]] ]
+; CHECK-NEXT:[[TMP6:%.*]] = extractelement <4 x i1> [[TMP2]], i32 1
+; CHECK-NEXT:br i1 [[TMP6]], label [[PRED_SREM_IF1:%.*]], label 
[[PRED_SREM_CONTINUE2:%.*]]
+; CHECK:   pred.srem.if1:
+; CHECK-NEXT:[[TMP7:%.*]] = srem i64 3, 0
+; CHECK-NEXT:br label [[PRED_SREM_CONTINUE2]]
+; CHECK:   pred.srem.continue2:
+; CHECK-NEXT:[[TMP8:%.*]] = phi i64 [ poison, [[PRED_SREM_CONTINUE]] ], [ 
[[TMP7]], [[PRED_SREM_IF1]] ]
+; CHECK-NEXT:[[TMP9:%.*]] = extractelement <4 x i1> [[TMP2]], i32 2
+; CHECK-NEXT:br i1 [[TMP9]], label [[PRED_SREM_IF3:%.*]], label 
[[PRED_SREM_CONTINUE4:%.*]]
+; CHECK:   pred.srem.if3:
+; CHECK-NEXT:[[TMP10:%.*]] = srem i64 3, 0
+; CHECK-NEXT:br label [[PRED_SREM_CONTINUE4]]
+; CHECK:   pred.srem.continue4:
+; CHECK-NEXT:[[TMP11:%.*]] = phi i64 [ poison, [[PRED_SREM_CONTINUE2]] ], 
[ [[TMP10]], [[PRED_SREM_IF3]] ]
+; CHECK-NEXT:[[TMP12:%.*]] = extractelement <4 x i1> [[TMP2]], i32 3
+; CHECK-NEXT:br i1 [[TMP12]], label [[PRED_SREM_IF5:%.*]], label 
[[PRED_SREM_CONTINUE6]]
+; CHECK:   pred.srem.if5:
+; CHECK-NEXT:[[TMP13:%.*]] = srem i64 3, 0
+; CHECK-NEXT:br label [[PRED_SREM_CONTINUE6]]
+; CHECK:   pred.srem.continue6:
+; CHECK-NEXT:[[TMP14:%.*]] = phi i64 [ poison, [[PRED_SREM_CONTINUE4]] ], 
[ [[TMP13]], [[PRED_SREM_IF5]] ]
+; CHECK-NEXT:[[TMP15:%.*]] = add i64 [[TMP5]], -3
+; CHECK-NEXT:[[TMP16:%.*]] = add i64 [[TMP0]], [[TMP15]]
+; CHECK-NEXT:[[TMP17:%.*]] = getelementptr [5 x i8], ptr @c, i64 0, i64 
[[TMP16]]
+; CHECK-NEXT:[[TMP18:%.*]] = getelementptr i8, ptr [[TMP17]], i32 0
+; CHECK-NEXT:[[WIDE_LOAD:%.*]] = load <4 x i8>, ptr [[TMP18]], align 1
+; CHECK-NEXT:[[PREDPHI:%.

[llvm-branch-commits] [llvm] 035b334 - [𝘀𝗽𝗿] initial version

2023-11-13 Thread Florian Hahn via llvm-branch-commits

Author: Florian Hahn
Date: 2023-11-13T22:06:01Z
New Revision: 035b334598b4375d4b0682a5ced3f58fcd5a2302

URL: 
https://github.com/llvm/llvm-project/commit/035b334598b4375d4b0682a5ced3f58fcd5a2302
DIFF: 
https://github.com/llvm/llvm-project/commit/035b334598b4375d4b0682a5ced3f58fcd5a2302.diff

LOG: [𝘀𝗽𝗿] initial version

Created using spr 1.3.4

Added: 


Modified: 
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
llvm/test/Transforms/LoopVectorize/AArch64/sve-vector-reverse-mask4.ll
llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse-mask4.ll
llvm/test/Transforms/LoopVectorize/X86/masked_load_store.ll

Removed: 




diff  --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp 
b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index e9d0315d114f65c..ae8d306c44dd885 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -9525,8 +9525,12 @@ void 
VPWidenMemoryInstructionRecipe::execute(VPTransformState &State) {
   InnerLoopVectorizer::VectorParts BlockInMaskParts(State.UF);
   bool isMaskRequired = getMask();
   if (isMaskRequired)
-for (unsigned Part = 0; Part < State.UF; ++Part)
-  BlockInMaskParts[Part] = State.get(getMask(), Part);
+for (unsigned Part = 0; Part < State.UF; ++Part) {
+  Value *Mask = State.get(getMask(), Part);
+  if (isReverse())
+Mask = Builder.CreateVectorReverse(Mask, "reverse");
+  BlockInMaskParts[Part] = Mask;
+}
 
   const auto CreateVecPtr = [&](unsigned Part, Value *Ptr) -> Value * {
 // Calculate the pointer for the specific unroll-part.
@@ -9558,9 +9562,6 @@ void 
VPWidenMemoryInstructionRecipe::execute(VPTransformState &State) {
   PartPtr = Builder.CreateGEP(ScalarDataTy, Ptr, NumElt, "", InBounds);
   PartPtr =
   Builder.CreateGEP(ScalarDataTy, PartPtr, LastLane, "", InBounds);
-  if (isMaskRequired) // Reverse of a null all-one mask is a null mask.
-BlockInMaskParts[Part] =
-Builder.CreateVectorReverse(BlockInMaskParts[Part], "reverse");
 } else {
   Value *Increment = createStepForVF(Builder, IndexTy, State.VF, Part);
   PartPtr = Builder.CreateGEP(ScalarDataTy, Ptr, Increment, "", InBounds);

diff  --git 
a/llvm/test/Transforms/LoopVectorize/AArch64/sve-vector-reverse-mask4.ll 
b/llvm/test/Transforms/LoopVectorize/AArch64/sve-vector-reverse-mask4.ll
index 58c54103c72c6dc..70833e44b075a98 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/sve-vector-reverse-mask4.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/sve-vector-reverse-mask4.ll
@@ -22,8 +22,8 @@ define void @vector_reverse_mask_nxv4i1(ptr %a, ptr %cond, 
i64 %N) #0 {
 ; CHECK: %[[WIDEMSKLOAD:.*]] = call  
@llvm.masked.load.nxv4f64.p0(ptr %{{.*}}, i32 8,  
%[[REVERSE6]],  poison)
 ; CHECK: %[[REVERSE7:.*]] = call  
@llvm.experimental.vector.reverse.nxv4f64( 
%[[WIDEMSKLOAD]])
 ; CHECK: %[[FADD:.*]] = fadd  %[[REVERSE7]]
+; CHECK: %[[REVERSE9:.*]] = call  
@llvm.experimental.vector.reverse.nxv4i1( %{{.*}})
 ; CHECK: %[[REVERSE8:.*]] = call  
@llvm.experimental.vector.reverse.nxv4f64( %[[FADD]])
-; CHECK:  %[[REVERSE9:.*]] = call  
@llvm.experimental.vector.reverse.nxv4i1( %{{.*}})
 ; CHECK: call void @llvm.masked.store.nxv4f64.p0( 
%[[REVERSE8]], ptr %{{.*}}, i32 8,  %[[REVERSE9]]
 
 entry:

diff  --git 
a/llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse-mask4.ll 
b/llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse-mask4.ll
index e8e2008912c8344..195826300e3996f 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse-mask4.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse-mask4.ll
@@ -43,16 +43,16 @@ define void @vector_reverse_mask_v4i1(ptr noalias %a, ptr 
noalias %cond, i64 %N)
 ; CHECK-NEXT:[[TMP5:%.*]] = fcmp une <4 x double> [[REVERSE]], 
zeroinitializer
 ; CHECK-NEXT:[[TMP6:%.*]] = fcmp une <4 x double> [[REVERSE2]], 
zeroinitializer
 ; CHECK-NEXT:[[TMP7:%.*]] = getelementptr double, ptr [[A:%.*]], i64 
[[TMP1]]
-; CHECK-NEXT:[[TMP8:%.*]] = getelementptr double, ptr [[TMP7]], i64 -3
 ; CHECK-NEXT:[[REVERSE3:%.*]] = shufflevector <4 x i1> [[TMP5]], <4 x i1> 
poison, <4 x i32> 
+; CHECK-NEXT:[[REVERSE4:%.*]] = shufflevector <4 x i1> [[TMP6]], <4 x i1> 
poison, <4 x i32> 
+; CHECK-NEXT:[[TMP8:%.*]] = getelementptr double, ptr [[TMP7]], i64 -3
 ; CHECK-NEXT:[[WIDE_MASKED_LOAD:%.*]] = call <4 x double> 
@llvm.masked.load.v4f64.p0(ptr [[TMP8]], i32 8, <4 x i1> [[REVERSE3]], <4 x 
double> poison)
 ; CHECK-NEXT:[[TMP9:%.*]] = getelementptr double, ptr [[TMP7]], i64 -7
-; CHECK-NEXT:[[REVERSE5:%.*]] = shufflevector <4 x i1> [[TMP6]], <4 x i1> 
poison, <4 x i32> 
-; CHECK-NEXT:[[WIDE_MASKED_LOAD6:%.*]] = call <4 x double> 
@llvm.masked.load.v4f64.p0(ptr [[TMP9]], i32 8, <4 x i1> [[REVERSE5]], <4 x 
double> poison)
+; CHECK-NEXT:[[WIDE_MASKED_LOAD6:%.*]

[llvm-branch-commits] [llvm] [VPlan] Model address separately. (PR #72164)

2023-11-13 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn created https://github.com/llvm/llvm-project/pull/72164

Move vector pointer generation to a separate VPInstruction opcode.
This untangles address computation from the memory recipes future
and is also needed to enable explicit unrolling in VPlan.





[llvm-branch-commits] [llvm] [VPlan] Model address separately. (PR #72164)

2023-11-13 Thread Florian Hahn via llvm-branch-commits

fhahn wrote:

Note that this patch depends on #72163

https://github.com/llvm/llvm-project/pull/72164


[llvm-branch-commits] [llvm] e8304de - [𝘀𝗽𝗿] initial version

2023-11-16 Thread Florian Hahn via llvm-branch-commits

Author: Florian Hahn
Date: 2023-11-16T20:48:54Z
New Revision: e8304de86f59fa66bc8af03b401b0c4d28f2ac97

URL: 
https://github.com/llvm/llvm-project/commit/e8304de86f59fa66bc8af03b401b0c4d28f2ac97
DIFF: 
https://github.com/llvm/llvm-project/commit/e8304de86f59fa66bc8af03b401b0c4d28f2ac97.diff

LOG: [𝘀𝗽𝗿] initial version

Created using spr 1.3.4

Added: 


Modified: 
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp

Removed: 




diff  --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp 
b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index 463a7b5bb1bb588..5859f58a9f462b0 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -4367,8 +4367,7 @@ static bool combineInstructionsOverFunction(
 Function &F, InstructionWorklist &Worklist, AliasAnalysis *AA,
 AssumptionCache &AC, TargetLibraryInfo &TLI, TargetTransformInfo &TTI,
 DominatorTree &DT, OptimizationRemarkEmitter &ORE, BlockFrequencyInfo *BFI,
-ProfileSummaryInfo *PSI, unsigned MaxIterations, bool VerifyFixpoint,
-LoopInfo *LI) {
+ProfileSummaryInfo *PSI, LoopInfo *LI, const InstCombineOptions &Opts) {
   auto &DL = F.getParent()->getDataLayout();
 
   /// Builder - This is an IRBuilder that automatically inserts new
@@ -4394,8 +4393,8 @@ static bool combineInstructionsOverFunction(
   while (true) {
 ++Iteration;
 
-if (Iteration > MaxIterations && !VerifyFixpoint) {
-  LLVM_DEBUG(dbgs() << "\n\n[IC] Iteration limit #" << MaxIterations
+if (Iteration > Opts.MaxIterations && !Opts.VerifyFixpoint) {
+  LLVM_DEBUG(dbgs() << "\n\n[IC] Iteration limit #" << Opts.MaxIterations
 << " on " << F.getName()
 << " reached; stopping without verifying fixpoint\n");
   break;
@@ -4414,10 +4413,10 @@ static bool combineInstructionsOverFunction(
   break;
 
 MadeIRChange = true;
-if (Iteration > MaxIterations) {
+if (Iteration > Opts.MaxIterations) {
   report_fatal_error(
   "Instruction Combining did not reach a fixpoint after " +
-  Twine(MaxIterations) + " iterations");
+  Twine(Opts.MaxIterations) + " iterations");
 }
   }
 
@@ -4468,8 +4467,7 @@ PreservedAnalyses InstCombinePass::run(Function &F,
   &AM.getResult(F) : nullptr;
 
   if (!combineInstructionsOverFunction(F, Worklist, AA, AC, TLI, TTI, DT, ORE,
-   BFI, PSI, Options.MaxIterations,
-   Options.VerifyFixpoint, LI))
+   BFI, PSI, LI, Options))
 // No changes, all analyses are preserved.
 return PreservedAnalyses::all();
 
@@ -4518,9 +4516,7 @@ bool InstructionCombiningPass::runOnFunction(Function &F) 
{
   nullptr;
 
   return combineInstructionsOverFunction(F, Worklist, AA, AC, TLI, TTI, DT, 
ORE,
- BFI, PSI,
- InstCombineDefaultMaxIterations,
- /*VerifyFixpoint */ false, LI);
+ BFI, PSI, LI, InstCombineOptions());
 }
 
 char InstructionCombiningPass::ID = 0;



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 267d656 - [𝘀𝗽𝗿] initial version

2023-11-16 Thread Florian Hahn via llvm-branch-commits

Author: Florian Hahn
Date: 2023-11-16T20:49:01Z
New Revision: 267d656ea58a77622dc512093bf64f50e4a04b95

URL: 
https://github.com/llvm/llvm-project/commit/267d656ea58a77622dc512093bf64f50e4a04b95
DIFF: 
https://github.com/llvm/llvm-project/commit/267d656ea58a77622dc512093bf64f50e4a04b95.diff

LOG: [𝘀𝗽𝗿] initial version

Created using spr 1.3.4

Added: 


Modified: 
llvm/include/llvm/Transforms/InstCombine/InstCombine.h
llvm/lib/Passes/PassBuilder.cpp
llvm/lib/Passes/PassBuilderPipelines.cpp
llvm/lib/Passes/PassRegistry.def
llvm/lib/Transforms/InstCombine/InstCombineInternal.h
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
llvm/test/Transforms/InstCombine/no_sink_instruction.ll
llvm/test/Transforms/PhaseOrdering/AArch64/sinking-vs-if-conversion.ll

Removed: 




diff  --git a/llvm/include/llvm/Transforms/InstCombine/InstCombine.h 
b/llvm/include/llvm/Transforms/InstCombine/InstCombine.h
index f38ec2debb18136..14d1c127984c874 100644
--- a/llvm/include/llvm/Transforms/InstCombine/InstCombine.h
+++ b/llvm/include/llvm/Transforms/InstCombine/InstCombine.h
@@ -32,6 +32,7 @@ struct InstCombineOptions {
   // Verify that a fix point has been reached after MaxIterations.
   bool VerifyFixpoint = false;
   unsigned MaxIterations = InstCombineDefaultMaxIterations;
+  bool EnableCodeSinking = true;
 
   InstCombineOptions() = default;
 
@@ -49,6 +50,11 @@ struct InstCombineOptions {
 MaxIterations = Value;
 return *this;
   }
+
+  InstCombineOptions &setEnableCodeSinking(bool Value) {
+EnableCodeSinking = Value;
+return *this;
+  }
 };
 
 class InstCombinePass : public PassInfoMixin {

diff  --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index dd9d799f9d55dcc..1e79ff660ea3ea2 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -872,6 +872,8 @@ Expected 
parseInstCombineOptions(StringRef Params) {
 ParamName).str(),
 inconvertibleErrorCode());
   Result.setMaxIterations((unsigned)MaxIterations.getZExtValue());
+} else if (ParamName == "code-sinking") {
+  Result.setEnableCodeSinking(Enable);
 } else {
   return make_error(
   formatv("invalid InstCombine pass parameter '{0}' ", 
ParamName).str(),

diff  --git a/llvm/lib/Passes/PassBuilderPipelines.cpp 
b/llvm/lib/Passes/PassBuilderPipelines.cpp
index f3d280316e04077..8946480340d29a9 100644
--- a/llvm/lib/Passes/PassBuilderPipelines.cpp
+++ b/llvm/lib/Passes/PassBuilderPipelines.cpp
@@ -1101,7 +1101,8 @@ 
PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
   FunctionPassManager GlobalCleanupPM;
   // FIXME: Should this instead by a run of SROA?
   GlobalCleanupPM.addPass(PromotePass());
-  GlobalCleanupPM.addPass(InstCombinePass());
+  GlobalCleanupPM.addPass(
+  InstCombinePass(InstCombineOptions().setEnableCodeSinking(false)));
   invokePeepholeEPCallbacks(GlobalCleanupPM, Level);
   GlobalCleanupPM.addPass(
   SimplifyCFGPass(SimplifyCFGOptions().convertSwitchRangeToICmp(true)));

diff  --git a/llvm/lib/Passes/PassRegistry.def 
b/llvm/lib/Passes/PassRegistry.def
index 2067fc473b522db..50dda63578a0add 100644
--- a/llvm/lib/Passes/PassRegistry.def
+++ b/llvm/lib/Passes/PassRegistry.def
@@ -526,7 +526,8 @@ FUNCTION_PASS_WITH_PARAMS("instcombine",
   parseInstCombineOptions,
   "no-use-loop-info;use-loop-info;"
   "no-verify-fixpoint;verify-fixpoint;"
-  "max-iterations=N"
+  "max-iterations=N;"
+  "no-code-sinking;code-sinking"
   )
 FUNCTION_PASS_WITH_PARAMS("mldst-motion",
   "MergedLoadStoreMotionPass",

diff  --git a/llvm/lib/Transforms/InstCombine/InstCombineInternal.h 
b/llvm/lib/Transforms/InstCombine/InstCombineInternal.h
index 68a8fb676d8d909..83364f14ef7db6c 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineInternal.h
+++ b/llvm/lib/Transforms/InstCombine/InstCombineInternal.h
@@ -53,6 +53,7 @@ class DataLayout;
 class DominatorTree;
 class GEPOperator;
 class GlobalVariable;
+struct InstCombineOptions;
 class LoopInfo;
 class OptimizationRemarkEmitter;
 class ProfileSummaryInfo;
@@ -68,9 +69,11 @@ class LLVM_LIBRARY_VISIBILITY InstCombinerImpl final
TargetLibraryInfo &TLI, TargetTransformInfo &TTI,
DominatorTree &DT, OptimizationRemarkEmitter &ORE,
BlockFrequencyInfo *BFI, ProfileSummaryInfo *PSI,
-   const DataLayout &DL, LoopInfo *LI)
+   const DataLayout &DL, LoopInfo *LI,
+   const InstCombineOptions &Opts)
   : InstCombiner(Worklist, Builder, MinimizeSize, AA, AC, TLI, TTI, DT, 
ORE,
- BFI, PSI, DL, LI) {}
+ BFI, PSI, DL

[llvm-branch-commits] [llvm] [Passes] Disable code sinking in InstCombine early on. (PR #72567)

2023-11-16 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn created https://github.com/llvm/llvm-project/pull/72567

Sinking instructions very early in the pipeline destroys dereferenceability
information that could be used by other passes; for example, it can prevent
if-conversion by SimplifyCFG.





[llvm-branch-commits] [llvm] [Passes] Disable code sinking in InstCombine early on. (PR #72567)

2023-11-16 Thread Florian Hahn via llvm-branch-commits

fhahn wrote:

The reason for keeping the original `EnableCodeSinking` option is to retain the 
ability to disable code sinking from the command line.

https://github.com/llvm/llvm-project/pull/72567


[llvm-branch-commits] [llvm] [Passes] Disable code sinking in InstCombine early on. (PR #72567)

2023-11-17 Thread Florian Hahn via llvm-branch-commits

fhahn wrote:

>  Is the idea here that if the original code executed these unconditionally, 
> then it's more likely than not that unconditionally executing them is 
> beneficial?

I think the main motivation for the patch is the hypothesis that sinking is a 
worse canonical form early on: once we have sunk an instruction we cannot really 
undo it easily, and we won't be able to consider certain transforms. 

Delaying sinking gives other passes like SimplifyCFG a chance to perform things 
like if-conversion, if considered profitable. There certainly could be 
regressions due to SimplifyCFG's cost model making a wrong decision, but I think 
in those cases it would be better to improve the cost model, rather than 
preventing it up-front by sinking (which isn't cost-model driven at all in 
InstCombine IIRC). It should also be possible to undo if-conversion in the 
backend, if that's more profitable there; at this point, we also arguably have 
much more accurate information about register pressure, available execution 
units, and accurate latencies to make a more informed decision.

Slightly orthogonal to this, one thing I want to look into at some point is 
adding a way to specify dereferenceability at various program points for 
pointers (e.g. via an intrinsic or assumption). That would ideally allow us to 
retain dereferenceability information from the original program even after 
sinking, and would thus still allow if-conversion. Avoiding sinking early on 
would probably still be the preferred early canonical form, I think.

https://github.com/llvm/llvm-project/pull/72567


[llvm-branch-commits] [llvm] [Passes] Disable code sinking in InstCombine early on. (PR #72567)

2023-11-17 Thread Florian Hahn via llvm-branch-commits

fhahn wrote:

The specific test cases came from some users who explicitly wanted the code 
there to be if-converted for the CPU they are targeting. It may not be 
profitable for all targets/CPUs though, so we would still rely on the cost 
model to make the correct decision per target/CPU.

https://github.com/llvm/llvm-project/pull/72567


[llvm-branch-commits] [llvm] 6f3b88b - [VPlan] Move trunc ([s|z]ext A) simplifications to simplifyRecipe.

2023-11-17 Thread Florian Hahn via llvm-branch-commits

Author: Florian Hahn
Date: 2023-11-16T21:17:10Z
New Revision: 6f3b88baa2ac9ec892ed3ad7dd64d0d537010bc5

URL: 
https://github.com/llvm/llvm-project/commit/6f3b88baa2ac9ec892ed3ad7dd64d0d537010bc5
DIFF: 
https://github.com/llvm/llvm-project/commit/6f3b88baa2ac9ec892ed3ad7dd64d0d537010bc5.diff

LOG: [VPlan] Move trunc ([s|z]ext A) simplifications to simplifyRecipe.

Split off simplification from D149903 as suggested.

This should be effectively NFC until D149903 lands.

Added: 


Modified: 
llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp

Removed: 




diff  --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp 
b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index c55864de9c17086..0eaaa037ad5782f 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -816,15 +816,28 @@ static void simplifyRecipe(VPRecipeBase &R, 
VPTypeAnalysis &TypeInfo) {
 break;
   }
   case Instruction::Trunc: {
-VPRecipeBase *Zext = R.getOperand(0)->getDefiningRecipe();
-if (!Zext || getOpcodeForRecipe(*Zext) != Instruction::ZExt)
+VPRecipeBase *Ext = R.getOperand(0)->getDefiningRecipe();
+if (!Ext)
   break;
-VPValue *A = Zext->getOperand(0);
+unsigned ExtOpcode = getOpcodeForRecipe(*Ext);
+if (ExtOpcode != Instruction::ZExt && ExtOpcode != Instruction::SExt)
+  break;
+VPValue *A = Ext->getOperand(0);
 VPValue *Trunc = R.getVPSingleValue();
-Type *TruncToTy = TypeInfo.inferScalarType(Trunc);
-if (TruncToTy && TruncToTy == TypeInfo.inferScalarType(A))
+Type *TruncTy = TypeInfo.inferScalarType(Trunc);
+Type *ATy = TypeInfo.inferScalarType(A);
+if (TruncTy == ATy) {
   Trunc->replaceAllUsesWith(A);
-
+} else if (ATy->getScalarSizeInBits() < TruncTy->getScalarSizeInBits()) {
+  auto *VPC =
+  new VPWidenCastRecipe(Instruction::CastOps(ExtOpcode), A, TruncTy);
+  VPC->insertBefore(&R);
+  Trunc->replaceAllUsesWith(VPC);
+} else if (ATy->getScalarSizeInBits() > TruncTy->getScalarSizeInBits()) {
+  auto *VPC = new VPWidenCastRecipe(Instruction::Trunc, A, TruncTy);
+  VPC->insertBefore(&R);
+  Trunc->replaceAllUsesWith(VPC);
+}
 #ifndef NDEBUG
 // Verify that the cached type info is for both A and its users is still
 // accurate by comparing it to freshly computed types.





[llvm-branch-commits] [lld] [libunwind] [libcxxabi] [libcxx] [compiler-rt] [llvm] [flang] [mlir] [clang-tools-extra] [lldb] [clang] [Passes] Disable code sinking in InstCombine early on. (PR #72567)

2023-11-17 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/72567


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 1f729e0 - [LV] Reverse mask up front, not when creating vector pointer.

2023-11-17 Thread Florian Hahn via llvm-branch-commits

Author: Florian Hahn
Date: 2023-11-17T12:52:06Z
New Revision: 1f729e0bc7a7012fa3a7618100f0865e4e32fc0d

URL: 
https://github.com/llvm/llvm-project/commit/1f729e0bc7a7012fa3a7618100f0865e4e32fc0d
DIFF: 
https://github.com/llvm/llvm-project/commit/1f729e0bc7a7012fa3a7618100f0865e4e32fc0d.diff

LOG: [LV] Reverse mask up front, not when creating vector pointer.

Reverse mask early on when populating BlockInMask. This will enable
separating mask management and address computation from the memory
recipes in the future and is also needed to enable explicit unrolling in
VPlan.

Pull Request: https://github.com/llvm/llvm-project/pull/72163

Added: 


Modified: 
llvm/test/Transforms/LoopVectorize/X86/masked_load_store.ll

Removed: 




diff  --git a/llvm/test/Transforms/LoopVectorize/X86/masked_load_store.ll 
b/llvm/test/Transforms/LoopVectorize/X86/masked_load_store.ll
index d775b0e0f0199c4..5cc4d43ec2e49f5 100644
--- a/llvm/test/Transforms/LoopVectorize/X86/masked_load_store.ll
+++ b/llvm/test/Transforms/LoopVectorize/X86/masked_load_store.ll
@@ -1515,24 +1515,24 @@ define void @foo6(ptr nocapture readonly %in, ptr 
nocapture %out, i32 %size, ptr
 ; AVX512-NEXT:[[TMP21:%.*]] = getelementptr double, ptr [[IN]], i64 
[[TMP1]]
 ; AVX512-NEXT:[[TMP22:%.*]] = getelementptr double, ptr [[IN]], i64 
[[TMP2]]
 ; AVX512-NEXT:[[TMP23:%.*]] = getelementptr double, ptr [[IN]], i64 
[[TMP3]]
+; AVX512-NEXT:[[REVERSE12:%.*]] = shufflevector <8 x i1> [[TMP16]], <8 x 
i1> poison, <8 x i32> 
+; AVX512-NEXT:[[REVERSE14:%.*]] = shufflevector <8 x i1> [[TMP17]], <8 x 
i1> poison, <8 x i32> 
+; AVX512-NEXT:[[REVERSE17:%.*]] = shufflevector <8 x i1> [[TMP18]], <8 x 
i1> poison, <8 x i32> 
+; AVX512-NEXT:[[REVERSE20:%.*]] = shufflevector <8 x i1> [[TMP19]], <8 x 
i1> poison, <8 x i32> 
 ; AVX512-NEXT:[[TMP24:%.*]] = getelementptr double, ptr [[TMP20]], i32 0
 ; AVX512-NEXT:[[TMP25:%.*]] = getelementptr double, ptr [[TMP24]], i32 -7
-; AVX512-NEXT:[[REVERSE12:%.*]] = shufflevector <8 x i1> [[TMP16]], <8 x 
i1> poison, <8 x i32> 
 ; AVX512-NEXT:[[WIDE_MASKED_LOAD:%.*]] = call <8 x double> 
@llvm.masked.load.v8f64.p0(ptr [[TMP25]], i32 8, <8 x i1> [[REVERSE12]], <8 x 
double> poison), !alias.scope !34
 ; AVX512-NEXT:[[REVERSE13:%.*]] = shufflevector <8 x double> 
[[WIDE_MASKED_LOAD]], <8 x double> poison, <8 x i32> 
 ; AVX512-NEXT:[[TMP26:%.*]] = getelementptr double, ptr [[TMP20]], i32 -8
 ; AVX512-NEXT:[[TMP27:%.*]] = getelementptr double, ptr [[TMP26]], i32 -7
-; AVX512-NEXT:[[REVERSE14:%.*]] = shufflevector <8 x i1> [[TMP17]], <8 x 
i1> poison, <8 x i32> 
 ; AVX512-NEXT:[[WIDE_MASKED_LOAD15:%.*]] = call <8 x double> 
@llvm.masked.load.v8f64.p0(ptr [[TMP27]], i32 8, <8 x i1> [[REVERSE14]], <8 x 
double> poison), !alias.scope !34
 ; AVX512-NEXT:[[REVERSE16:%.*]] = shufflevector <8 x double> 
[[WIDE_MASKED_LOAD15]], <8 x double> poison, <8 x i32> 
 ; AVX512-NEXT:[[TMP28:%.*]] = getelementptr double, ptr [[TMP20]], i32 -16
 ; AVX512-NEXT:[[TMP29:%.*]] = getelementptr double, ptr [[TMP28]], i32 -7
-; AVX512-NEXT:[[REVERSE17:%.*]] = shufflevector <8 x i1> [[TMP18]], <8 x 
i1> poison, <8 x i32> 
 ; AVX512-NEXT:[[WIDE_MASKED_LOAD18:%.*]] = call <8 x double> 
@llvm.masked.load.v8f64.p0(ptr [[TMP29]], i32 8, <8 x i1> [[REVERSE17]], <8 x 
double> poison), !alias.scope !34
 ; AVX512-NEXT:[[REVERSE19:%.*]] = shufflevector <8 x double> 
[[WIDE_MASKED_LOAD18]], <8 x double> poison, <8 x i32> 
 ; AVX512-NEXT:[[TMP30:%.*]] = getelementptr double, ptr [[TMP20]], i32 -24
 ; AVX512-NEXT:[[TMP31:%.*]] = getelementptr double, ptr [[TMP30]], i32 -7
-; AVX512-NEXT:[[REVERSE20:%.*]] = shufflevector <8 x i1> [[TMP19]], <8 x 
i1> poison, <8 x i32> 
 ; AVX512-NEXT:[[WIDE_MASKED_LOAD21:%.*]] = call <8 x double> 
@llvm.masked.load.v8f64.p0(ptr [[TMP31]], i32 8, <8 x i1> [[REVERSE20]], <8 x 
double> poison), !alias.scope !34
 ; AVX512-NEXT:[[REVERSE22:%.*]] = shufflevector <8 x double> 
[[WIDE_MASKED_LOAD21]], <8 x double> poison, <8 x i32> 
 ; AVX512-NEXT:[[TMP32:%.*]] = fadd <8 x double> [[REVERSE13]], 





[llvm-branch-commits] [llvm] e3c1a20 - [𝘀𝗽𝗿] initial version

2023-11-17 Thread Florian Hahn via llvm-branch-commits

Author: Florian Hahn
Date: 2023-11-17T12:52:05Z
New Revision: e3c1a20a1b5a115a75d83f5783ba64927607f427

URL: 
https://github.com/llvm/llvm-project/commit/e3c1a20a1b5a115a75d83f5783ba64927607f427
DIFF: 
https://github.com/llvm/llvm-project/commit/e3c1a20a1b5a115a75d83f5783ba64927607f427.diff

LOG: [𝘀𝗽𝗿] initial version

Created using spr 1.3.4

Added: 


Modified: 
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
llvm/test/Transforms/LoopVectorize/AArch64/sve-vector-reverse-mask4.ll
llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse-mask4.ll
llvm/test/Transforms/LoopVectorize/X86/masked_load_store.ll

Removed: 




diff  --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp 
b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 64093a0db5f81c8..3b41939246b9946 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -9562,8 +9562,12 @@ void 
VPWidenMemoryInstructionRecipe::execute(VPTransformState &State) {
   InnerLoopVectorizer::VectorParts BlockInMaskParts(State.UF);
   bool isMaskRequired = getMask();
   if (isMaskRequired)
-for (unsigned Part = 0; Part < State.UF; ++Part)
-  BlockInMaskParts[Part] = State.get(getMask(), Part);
+for (unsigned Part = 0; Part < State.UF; ++Part) {
+  Value *Mask = State.get(getMask(), Part);
+  if (isReverse())
+Mask = Builder.CreateVectorReverse(Mask, "reverse");
+  BlockInMaskParts[Part] = Mask;
+}
 
   const auto CreateVecPtr = [&](unsigned Part, Value *Ptr) -> Value * {
 // Calculate the pointer for the specific unroll-part.
@@ -9595,9 +9599,6 @@ void 
VPWidenMemoryInstructionRecipe::execute(VPTransformState &State) {
   PartPtr = Builder.CreateGEP(ScalarDataTy, Ptr, NumElt, "", InBounds);
   PartPtr =
   Builder.CreateGEP(ScalarDataTy, PartPtr, LastLane, "", InBounds);
-  if (isMaskRequired) // Reverse of a null all-one mask is a null mask.
-BlockInMaskParts[Part] =
-Builder.CreateVectorReverse(BlockInMaskParts[Part], "reverse");
 } else {
   Value *Increment = createStepForVF(Builder, IndexTy, State.VF, Part);
   PartPtr = Builder.CreateGEP(ScalarDataTy, Ptr, Increment, "", InBounds);

diff  --git 
a/llvm/test/Transforms/LoopVectorize/AArch64/sve-vector-reverse-mask4.ll 
b/llvm/test/Transforms/LoopVectorize/AArch64/sve-vector-reverse-mask4.ll
index 58c54103c72c6dc..70833e44b075a98 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/sve-vector-reverse-mask4.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/sve-vector-reverse-mask4.ll
@@ -22,8 +22,8 @@ define void @vector_reverse_mask_nxv4i1(ptr %a, ptr %cond, 
i64 %N) #0 {
; CHECK: %[[WIDEMSKLOAD:.*]] = call <vscale x 4 x double> @llvm.masked.load.nxv4f64.p0(ptr %{{.*}}, i32 8, <vscale x 4 x i1> %[[REVERSE6]], <vscale x 4 x double> poison)
; CHECK: %[[REVERSE7:.*]] = call <vscale x 4 x double> @llvm.experimental.vector.reverse.nxv4f64(<vscale x 4 x double> %[[WIDEMSKLOAD]])
; CHECK: %[[FADD:.*]] = fadd <vscale x 4 x double> %[[REVERSE7]]
+; CHECK: %[[REVERSE9:.*]] = call <vscale x 4 x i1> @llvm.experimental.vector.reverse.nxv4i1(<vscale x 4 x i1> %{{.*}})
; CHECK: %[[REVERSE8:.*]] = call <vscale x 4 x double> @llvm.experimental.vector.reverse.nxv4f64(<vscale x 4 x double> %[[FADD]])
-; CHECK:  %[[REVERSE9:.*]] = call <vscale x 4 x i1> @llvm.experimental.vector.reverse.nxv4i1(<vscale x 4 x i1> %{{.*}})
; CHECK: call void @llvm.masked.store.nxv4f64.p0(<vscale x 4 x double> %[[REVERSE8]], ptr %{{.*}}, i32 8, <vscale x 4 x i1> %[[REVERSE9]]
 
 entry:

diff  --git 
a/llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse-mask4.ll 
b/llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse-mask4.ll
index e8e2008912c8344..195826300e3996f 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse-mask4.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse-mask4.ll
@@ -43,16 +43,16 @@ define void @vector_reverse_mask_v4i1(ptr noalias %a, ptr 
noalias %cond, i64 %N)
 ; CHECK-NEXT:[[TMP5:%.*]] = fcmp une <4 x double> [[REVERSE]], 
zeroinitializer
 ; CHECK-NEXT:[[TMP6:%.*]] = fcmp une <4 x double> [[REVERSE2]], 
zeroinitializer
 ; CHECK-NEXT:[[TMP7:%.*]] = getelementptr double, ptr [[A:%.*]], i64 
[[TMP1]]
-; CHECK-NEXT:[[TMP8:%.*]] = getelementptr double, ptr [[TMP7]], i64 -3
 ; CHECK-NEXT:[[REVERSE3:%.*]] = shufflevector <4 x i1> [[TMP5]], <4 x i1> 
poison, <4 x i32> 
+; CHECK-NEXT:[[REVERSE4:%.*]] = shufflevector <4 x i1> [[TMP6]], <4 x i1> 
poison, <4 x i32> 
+; CHECK-NEXT:[[TMP8:%.*]] = getelementptr double, ptr [[TMP7]], i64 -3
 ; CHECK-NEXT:[[WIDE_MASKED_LOAD:%.*]] = call <4 x double> 
@llvm.masked.load.v4f64.p0(ptr [[TMP8]], i32 8, <4 x i1> [[REVERSE3]], <4 x 
double> poison)
 ; CHECK-NEXT:[[TMP9:%.*]] = getelementptr double, ptr [[TMP7]], i64 -7
-; CHECK-NEXT:[[REVERSE5:%.*]] = shufflevector <4 x i1> [[TMP6]], <4 x i1> 
poison, <4 x i32> 
-; CHECK-NEXT:[[WIDE_MASKED_LOAD6:%.*]] = call <4 x double> 
@llvm.masked.load.v4f64.p0(ptr [[TMP9]], i32 8, <4 x i1> [[REVERSE5]], <4 x 
double> poison)
+; CHECK-NEXT:[[WIDE_MASKED_LOAD6:%.*]

[llvm-branch-commits] [mlir] [lldb] [llvm] [libunwind] [libcxx] [clang] [VPlan] Model address separately. (PR #72164)

2023-11-17 Thread Florian Hahn via llvm-branch-commits


[llvm-branch-commits] [llvm] [LivePhysRegs] Add callee-saved regs from MFI in addLiveOutsNoPristines. (PR #73553)

2023-11-27 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn edited https://github.com/llvm/llvm-project/pull/73553



[llvm-branch-commits] [llvm] [LivePhysRegs] Add callee-saved regs from MFI in addLiveOutsNoPristines. (PR #73553)

2023-11-27 Thread Florian Hahn via llvm-branch-commits

fhahn wrote:

> Is this about computing **live-outs** of the return block as the code 
> suggests? (The summary currently talks about live-ins?)

Thanks, it should be **live-outs** in the description, updated!

> I don't remember the situation on aarch64, but if by chance LR is modeled 
> with this "pristine register" concept, then maybe the caller needs to use 
> addLiveIns() rather than addLiveInsNoPristines()?

I am not sure, but looking at `updateLiveness` 
(https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/PrologEpilogInserter.cpp#L582),
it looks like it uses the saved registers from MFI. Pristine registers, I think, 
contain all callee-saved registers for the target, which may overestimate the 
liveness quite a bit.



https://github.com/llvm/llvm-project/pull/73553


[llvm-branch-commits] [llvm] e59b116 - fixup! fix formatting.

2023-11-27 Thread Florian Hahn via llvm-branch-commits

Author: Florian Hahn
Date: 2023-11-27T19:15:33Z
New Revision: e59b116dbc7bf501016c7c68f348c39df8f598a8

URL: 
https://github.com/llvm/llvm-project/commit/e59b116dbc7bf501016c7c68f348c39df8f598a8
DIFF: 
https://github.com/llvm/llvm-project/commit/e59b116dbc7bf501016c7c68f348c39df8f598a8.diff

LOG: fixup! fix formatting.

Added: 


Modified: 
llvm/lib/CodeGen/LivePhysRegs.cpp

Removed: 




diff  --git a/llvm/lib/CodeGen/LivePhysRegs.cpp 
b/llvm/lib/CodeGen/LivePhysRegs.cpp
index 634f46d9d98edc6..20b517b1e1a5c11 100644
--- a/llvm/lib/CodeGen/LivePhysRegs.cpp
+++ b/llvm/lib/CodeGen/LivePhysRegs.cpp
@@ -222,7 +222,7 @@ void LivePhysRegs::addLiveOutsNoPristines(const 
MachineBasicBlock &MBB) {
 const MachineFrameInfo &MFI = MF.getFrameInfo();
 if (MFI.isCalleeSavedInfoValid()) {
   for (const CalleeSavedInfo &Info : MFI.getCalleeSavedInfo())
-  addReg(Info.getReg());
+addReg(Info.getReg());
 }
   }
 }





[llvm-branch-commits] [llvm] [LivePhysRegs] Add callee-saved regs from MFI in addLiveOutsNoPristines. (PR #73553)

2023-11-27 Thread Florian Hahn via llvm-branch-commits

fhahn wrote:


> I still feel like I am missing something here, and it's been a while since I 
> looked at this. But my impression is that LLVM modeling is "cheating" a bit 
> in that technically all the callee-saves should be implicit-uses of the 
> return instruction (and not live afterwards) but we don't model it that way 
> and instead make them appear as live-outs of the return block? So doesn't 
> seem like overestimating the liveness because of our current modeling?

Yep, the current modeling in general may overestimate the liveness. By 
overestimating I meant that with this patch we overestimate in more places, but 
that's by design.

> 
> If I'm reading your patch correctly it would mean we would start adding all 
> pristine registers for the return block[1]. I am just confused so far because 
> this is happening in a function called `addLiveOutsNoPristines`...
>

I am not sure what exactly the definition of pristines is (and I am not super 
familiar with this code in general), and maybe the function name needs to be 
changed. The main thing to note is that it won't add all pristines; 
`addPristines` adds all callee-saved registers (via TRI) and removes the ones 
which are in the machine function's CalleeSavedInfo. The patch adds pristines, 
but *only* those that have been added to CalleeSavedInfo.
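To make that set arithmetic concrete, here is a minimal stand-alone sketch of the computation described above (all names here are invented for illustration; the real code operates on `MCPhysReg` values via `TargetRegisterInfo` and `MachineFrameInfo`, not on strings):

```cpp
#include <cassert>
#include <set>
#include <string>

// Register names stand in for the target's MCPhysReg values.
using RegSet = std::set<std::string>;

// Sketch of what addPristines does per the description above: start from all
// callee-saved registers the target defines (via TRI), then drop the ones
// recorded in the function's CalleeSavedInfo, i.e. the ones that are actually
// saved/restored in this function. What remains is the pristine set.
RegSet computePristines(const RegSet &TargetCalleeSaved,
                        const RegSet &SavedInThisFunction) {
  RegSet Pristines = TargetCalleeSaved;
  for (const auto &Reg : SavedInThisFunction)
    Pristines.erase(Reg);
  return Pristines;
}
```

So with target callee-saved registers {LR, R4, R5, R6} and CalleeSavedInfo recording {LR, R4}, only R5 and R6 remain pristine.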

 
> [1] Pristine == "callee saved but happens to be unused and hence not 
> saved/restored" which is what happens when you remove that 
> `Info.isRestored()` check?


> Which code/pass is using LivePhysRegs that is causing you trouble here?

The issue is in `BranchFolding`, where after splitting, the liveness of `LR` 
isn't preserved in the new or merged blocks. It could be fixed locally there by 
iterating over the registers in CalleeSavedInfo and checking if they were 
live-in in the original block (see below), but I am worried that fixing this 
locally leaves us open for similar issues in other parts of the codebase.
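A rough stand-alone sketch of that local fix (names are hypothetical; the real code would consult `MachineBasicBlock` live-in lists and the CalleeSavedInfo from `MachineFrameInfo`):

```cpp
#include <cassert>
#include <set>
#include <string>

using RegSet = std::set<std::string>;

// When splitting a block, propagate a callee-saved register from
// CalleeSavedInfo to the new block's live-ins only if it was live into the
// original block. This is the local BranchFolding fix described above.
RegSet liveInsForSplitBlock(const RegSet &CalleeSavedInfoRegs,
                            const RegSet &OrigBlockLiveIns) {
  RegSet Result;
  for (const auto &Reg : CalleeSavedInfoRegs)
    if (OrigBlockLiveIns.count(Reg))
      Result.insert(Reg);
  return Result;
}
```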

https://github.com/llvm/llvm-project/pull/73553


[llvm-branch-commits] [llvm] [LivePhysRegs] Add callee-saved regs from MFI in addLiveOutsNoPristines. (PR #73553)

2023-11-28 Thread Florian Hahn via llvm-branch-commits

fhahn wrote:

> I haven't looked closely to the patch, but I share @MatzeB's concerns here.
> 
> Essentially this patch is reverting https://reviews.llvm.org/D36160, which 
> was fixing a modeling issue with LR on ARM to begin with!

Thanks for sharing the additional context and where `IsRestored` is actually 
set. Taking a look at the original patch, it seems like it doesn't properly 
account for the fact that there could be multiple return blocks which may not 
be using `POP` to restore LR to PC. This could be due to shrink-wrapping + 
tail-calls in some exit blocks (like in 
`outlined-fn-may-clobber-lr-in-caller.ll`) or possibly some return instructions 
not using POP.

IIUC D36160 tries to track LR liveness more accurately and doesn't fix a 
miscompile, but it potentially introduced a miscompile due to underestimating 
the liveness of LR.

I don't think the current interface allows properly checking whether all exit 
blocks are covered by POP insts in `restoreCalleeSavedRegisters`, as it works on 
single return blocks only.

Without changing the API, we could check if LR is marked as not restored, and 
if it is check if there are multiple return blocks, as sketched in 
https://gist.github.com/fhahn/67937125b64440a8a414909c4a1b7973.

This could be further refined to check if POP could be used for all returns 
(not sure if it is worth it given the small impact on the tests), or the API 
could be changed to pass all return blocks to avoid re-scanning for returns on 
each call (not sure if we should extend the general API even more for this 
workaround though). WDYT?
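The core check from the linked gist can be sketched in isolation roughly as follows (the struct and function names are invented here; the actual code would scan the `MachineFunction`'s blocks for return instructions):

```cpp
#include <cassert>
#include <vector>

// Stand-in for the per-block information the real code reads off
// MachineBasicBlock.
struct BlockInfo {
  bool IsReturnBlock;
};

// Only keep LR marked as "not restored" (i.e. popped straight into PC) when
// the function has a single return block; with multiple return blocks (e.g.
// shrink-wrapping + tail calls), not all of them are guaranteed to restore LR
// via POP, so LR must conservatively be treated as restored and live.
bool keepLRMarkedNotRestored(const std::vector<BlockInfo> &Blocks) {
  unsigned NumReturnBlocks = 0;
  for (const BlockInfo &BB : Blocks)
    if (BB.IsReturnBlock)
      ++NumReturnBlocks;
  return NumReturnBlocks == 1;
}
```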

https://github.com/llvm/llvm-project/pull/73553


[llvm-branch-commits] [llvm] [LivePhysRegs] Add callee-saved regs from MFI in addLiveOutsNoPristines. (PR #73553)

2023-11-28 Thread Florian Hahn via llvm-branch-commits

fhahn wrote:

@kparzysz please take a look at 
https://gist.github.com/fhahn/67937125b64440a8a414909c4a1b7973, which has a much 
more limited impact. 

> I haven't looked at the updated testcases in detail, but I see that most of 
> the changes are in treating LR as live (whereas it was dead before). At the 
> first glance that doesn't look right.

We might now overestimate liveness, which only results in missed performance, 
correct? Although I think this is mostly theoretical at this point, as there are 
no test cases that show that. The issue is that underestimating, as we currently 
do, leads to incorrect results, in particular with tail calls that use LR 
implicitly. If LR isn't marked as live in that case, other passes are free to 
clobber LR (e.g. the machine outliner by introducing calls using BL, as in 
https://github.com/llvm/llvm-project/blob/20f634f275b431ff256ba45cbcbb6dc5bd945fb3/llvm/test/CodeGen/Thumb2/outlined-fn-may-clobber-lr-in-caller.ll).

https://github.com/llvm/llvm-project/pull/73553
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [VPlan] Initial modeling of runtime VF * UF as VPValue. (PR #74761)

2023-12-07 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn created https://github.com/llvm/llvm-project/pull/74761

This patch starts initial modeling of runtime VF * UF in VPlan. It introduces
a dedicated RuntimeVFxUF VPValue, which is populated during
VPlan::prepareToExecute. For now, the runtime VF * UF applies only to the
main vector loop region. Once we extend the scope of VPlan in the future, we
may want to associate different VFxUFs with different vector loop regions
(e.g. the epilogue vector loop).

This allows explicitly parameterizing recipes that rely on the runtime
VF * UF, like the canonical induction increment. At the moment, this
mainly helps to avoid generating some duplicated calls to vscale with
scalable vectors. It should also allow using EVL as induction increments
explicitly in D99750. Referring to VF * UF is also needed in other
places that we plan to migrate to VPlan, like the minimum trip count
check during skeleton creation.

The first version creates the value for VF * UF directly in
prepareToExecute to limit the scope of the patch. A follow-on patch will
model VF * UF computation explicitly in VPlan using recipes.

Moved from Phabricator (https://reviews.llvm.org/D157322)
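For intuition only, the arithmetic being modeled can be sketched as follows (all names are invented; in the actual patch VF * UF is materialized as an IR value during `VPlan::prepareToExecute`, not computed as a host-side number):

```cpp
#include <cassert>

// For fixed-width vectors the runtime VF equals the chosen VF; for scalable
// vectors it is vscale * the minimum VF known at compile time. VF * UF is the
// per-iteration step of the vector loop (e.g. the canonical IV increment).
unsigned runtimeVFxUF(unsigned MinVF, bool Scalable, unsigned VScale,
                      unsigned UF) {
  unsigned VF = Scalable ? VScale * MinVF : MinVF;
  return VF * UF;
}
```

Materializing this value once means users such as the canonical induction increment can share it, instead of each emitting its own vscale-based computation.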





[llvm-branch-commits] [lld] [polly] [openmp] [llvm] [flang] [clang] [mlir] [compiler-rt] [lldb] [VPlan] Initial modeling of runtime VF * UF as VPValue. (PR #74761)

2023-12-08 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/74761




[llvm-branch-commits] [llvm] [clang] [lldb] [polly] [compiler-rt] [openmp] [mlir] [lld] [flang] [VPlan] Initial modeling of runtime VF * UF as VPValue. (PR #74761)

2023-12-08 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/74761




[llvm-branch-commits] [llvm] [clang] [lldb] [polly] [compiler-rt] [openmp] [mlir] [lld] [flang] [VPlan] Initial modeling of runtime VF * UF as VPValue. (PR #74761)

2023-12-08 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn updated https://github.com/llvm/llvm-project/pull/74761

From 6ec44342b09474536d98de55238ee59452c06518 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Fri, 8 Dec 2023 11:00:17 +
Subject: [PATCH] for -> of

Created using spr 1.3.4
---
 llvm/lib/Transforms/Vectorize/VPlan.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h 
b/llvm/lib/Transforms/Vectorize/VPlan.h
index 30e4bf2a226d8..94cb768898136 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -2646,7 +2646,7 @@ class VPlan {
   /// The vector trip count.
   VPValue &getVectorTripCount() { return VectorTripCount; }
 
-  /// Returns VF * UF for the vector loop region.
+  /// Returns VF * UF of the vector loop region.
   VPValue &getVFxUF() { return VFxUF; }
 
   /// Mark the plan to indicate that using Value2VPValue is not safe any



[llvm-branch-commits] [lld] [polly] [openmp] [llvm] [flang] [clang] [mlir] [compiler-rt] [lldb] [VPlan] Initial modeling of runtime VF * UF as VPValue. (PR #74761)

2023-12-08 Thread Florian Hahn via llvm-branch-commits


@@ -2624,6 +2644,9 @@ class VPlan {
   /// The vector trip count.
   VPValue &getVectorTripCount() { return VectorTripCount; }
 
+  /// Returns runtime VF * UF for the vector loop region.

fhahn wrote:

Done, thanks!

https://github.com/llvm/llvm-project/pull/74761


[llvm-branch-commits] [llvm] [clang] [lldb] [polly] [compiler-rt] [openmp] [mlir] [lld] [flang] [VPlan] Initial modeling of runtime VF * UF as VPValue. (PR #74761)

2023-12-08 Thread Florian Hahn via llvm-branch-commits


@@ -1168,13 +1166,26 @@ class VPInstruction : public VPRecipeWithIRFlags, 
public VPValue {
   return false;
 case VPInstruction::ActiveLaneMask:
 case VPInstruction::CalculateTripCountMinusVF:
-case VPInstruction::CanonicalIVIncrement:
 case VPInstruction::CanonicalIVIncrementForPart:
 case VPInstruction::BranchOnCount:
   return true;
 };
 llvm_unreachable("switch should return");
   }
+

fhahn wrote:

Added, thanks!

https://github.com/llvm/llvm-project/pull/74761
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [clang] [lldb] [polly] [compiler-rt] [openmp] [mlir] [lld] [flang] [VPlan] Initial modeling of runtime VF * UF as VPValue. (PR #74761)

2023-12-08 Thread Florian Hahn via llvm-branch-commits

https://github.com/fhahn edited https://github.com/llvm/llvm-project/pull/74761

