[llvm-branch-commits] [BOLT] Use offset deduplication for cold fragments (PR #87853)

2024-04-15 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov edited https://github.com/llvm/llvm-project/pull/87853
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [BOLT] Use offset deduplication for cold fragments (PR #87853)

2024-04-15 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov updated 
https://github.com/llvm/llvm-project/pull/87853




[llvm-branch-commits] [mlir] de88bd7 - Revert "Fix rsqrt inaccuracies. (#88691)"

2024-04-15 Thread via llvm-branch-commits

Author: Johannes Reifferscheid
Date: 2024-04-15T11:48:12+02:00
New Revision: de88bd7e8925f5df51547e20f6fbd1ef006386ad

URL: 
https://github.com/llvm/llvm-project/commit/de88bd7e8925f5df51547e20f6fbd1ef006386ad
DIFF: 
https://github.com/llvm/llvm-project/commit/de88bd7e8925f5df51547e20f6fbd1ef006386ad.diff

LOG: Revert "Fix rsqrt inaccuracies. (#88691)"

This reverts commit 8ddaf750746d7f9b5f7e878870b086edc0f55326.
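
For orientation, the lowering removed by this revert computed the complex reciprocal square root in polar form, as the deleted `RsqrtOpConversion` lines in the diff show (magnitude via `computeAbs`, angle via `atan2` scaled by -0.5). A minimal sketch of that arithmetic in plain C++ follows; `rsqrtPolar` is a hypothetical name, and the zero, infinity, and NaN handling from the reverted patch is deliberately left out.

```cpp
#include <cmath>
#include <complex>

// Hypothetical helper, not part of MLIR: reciprocal square root in polar form,
//   1/sqrt(z) = |z|^(-1/2) * (cos(-arg(z)/2) + i*sin(-arg(z)/2)).
// Special cases (zero, infinity, NaN) are not handled in this sketch.
std::complex<double> rsqrtPolar(std::complex<double> z) {
  double absRsqrt = 1.0 / std::sqrt(std::abs(z));          // |z|^(-1/2)
  double halfArg = -0.5 * std::atan2(z.imag(), z.real());  // -arg(z)/2
  return {absRsqrt * std::cos(halfArg), absRsqrt * std::sin(halfArg)};
}
```

For finite, nonzero inputs the result should agree with `1.0 / std::sqrt(z)` up to rounding.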

Added: 


Modified: 
mlir/lib/Conversion/ComplexToStandard/ComplexToStandard.cpp
mlir/test/Conversion/ComplexToStandard/convert-to-standard.mlir

Removed: 




diff --git a/mlir/lib/Conversion/ComplexToStandard/ComplexToStandard.cpp b/mlir/lib/Conversion/ComplexToStandard/ComplexToStandard.cpp
index 3ebee9baff31bd..49eb575212ffc1 100644
--- a/mlir/lib/Conversion/ComplexToStandard/ComplexToStandard.cpp
+++ b/mlir/lib/Conversion/ComplexToStandard/ComplexToStandard.cpp
@@ -27,11 +27,9 @@ using namespace mlir;
 
 namespace {
 
-enum class AbsFn { abs, sqrt, rsqrt };
-
-// Returns the absolute value, its square root or its reciprocal square root.
+// Returns the absolute value or its square root.
 Value computeAbs(Value real, Value imag, arith::FastMathFlags fmf,
- ImplicitLocOpBuilder &b, AbsFn fn = AbsFn::abs) {
+ ImplicitLocOpBuilder &b, bool returnSqrt = false) {
   Value one = b.create(real.getType(),
   b.getFloatAttr(real.getType(), 1.0));
 
@@ -45,13 +43,7 @@ Value computeAbs(Value real, Value imag, arith::FastMathFlags fmf,
   Value ratioSqPlusOne = b.create(ratioSq, one, fmf);
   Value result;
 
-  if (fn == AbsFn::rsqrt) {
-ratioSqPlusOne = b.create(ratioSqPlusOne, fmf);
-min = b.create(min, fmf);
-max = b.create(max, fmf);
-  }
-
-  if (fn == AbsFn::sqrt) {
+  if (returnSqrt) {
 Value quarter = b.create(
 real.getType(), b.getFloatAttr(real.getType(), 0.25));
 // sqrt(sqrt(a*b)) would avoid the pow, but will overflow more easily.
@@ -871,7 +863,7 @@ struct SqrtOpConversion : public OpConversionPattern {
 
 Value real = b.create(elementType, adaptor.getComplex());
 Value imag = b.create(elementType, adaptor.getComplex());
-Value absSqrt = computeAbs(real, imag, fmf, b, AbsFn::sqrt);
+Value absSqrt = computeAbs(real, imag, fmf, b, /*returnSqrt=*/true);
 Value argArg = b.create(imag, real, fmf);
 Value sqrtArg = b.create(argArg, half, fmf);
 Value cos = b.create(sqrtArg, fmf);
@@ -1155,74 +1147,18 @@ struct RsqrtOpConversion : public OpConversionPattern {
   LogicalResult
   matchAndRewrite(complex::RsqrtOp op, OpAdaptor adaptor,
   ConversionPatternRewriter &rewriter) const override {
-mlir::ImplicitLocOpBuilder b(op.getLoc(), rewriter);
+mlir::ImplicitLocOpBuilder builder(op.getLoc(), rewriter);
 auto type = cast(adaptor.getComplex().getType());
 auto elementType = cast(type.getElementType());
 
-arith::FastMathFlags fmf = op.getFastMathFlagsAttr().getValue();
-
-auto cst = [&](APFloat v) {
-  return b.create(elementType,
- b.getFloatAttr(elementType, v));
-};
-const auto &floatSemantics = elementType.getFloatSemantics();
-Value zero = cst(APFloat::getZero(floatSemantics));
-Value inf = cst(APFloat::getInf(floatSemantics));
-Value negHalf = b.create(
-elementType, b.getFloatAttr(elementType, -0.5));
-Value nan = cst(APFloat::getNaN(floatSemantics));
-
-Value real = b.create(elementType, adaptor.getComplex());
-Value imag = b.create(elementType, adaptor.getComplex());
-Value absRsqrt = computeAbs(real, imag, fmf, b, AbsFn::rsqrt);
-Value argArg = b.create(imag, real, fmf);
-Value rsqrtArg = b.create(argArg, negHalf, fmf);
-Value cos = b.create(rsqrtArg, fmf);
-Value sin = b.create(rsqrtArg, fmf);
-
-Value resultReal = b.create(absRsqrt, cos, fmf);
-Value resultImag = b.create(absRsqrt, sin, fmf);
-
-if (!arith::bitEnumContainsAll(fmf, arith::FastMathFlags::nnan |
-arith::FastMathFlags::ninf)) {
-  Value negOne = b.create(
-  elementType, b.getFloatAttr(elementType, -1));
-
-  Value realSignedZero = b.create(zero, real, fmf);
-  Value imagSignedZero = b.create(zero, imag, fmf);
-  Value negImagSignedZero =
-  b.create(negOne, imagSignedZero, fmf);
+Value c = builder.create(
+elementType, builder.getFloatAttr(elementType, -0.5));
+Value d = builder.create(
+elementType, builder.getFloatAttr(elementType, 0));
 
-  Value absReal = b.create(real, fmf);
-  Value absImag = b.create(imag, fmf);
-
-  Value absImagIsInf =
-  b.create(arith::CmpFPredicate::OEQ, absImag, inf, fmf);
-  Value realIsNan =
-  b.create(arith::CmpFPredicate::UNO, real, real, fmf);
-  Value realIsInf =
- 

[llvm-branch-commits] [clang] [Serialization] Code cleanups and polish 83233 (PR #83237)

2024-04-15 Thread Ilya Biryukov via llvm-branch-commits

ilya-biryukov wrote:

> Can you export all files into a standalone reproducer? I might be able to 
> reduce an example.

Not really, which is why it's taking so long. Our infrastructure in that space
is lacking. The issue is that the root cause is not in one compilation step but
rather in some of the dependencies, and the dependency graph for any of those
problems is really large.

We will get you a reproducer, please bear with us. Sorry for taking a long time.

https://github.com/llvm/llvm-project/pull/83237


[llvm-branch-commits] [clang] [Serialization] Code cleanups and polish 83233 (PR #83237)

2024-04-15 Thread Vassil Vassilev via llvm-branch-commits

vgvassilev wrote:

> > Can you export all files into a standalone reproducer? I might be able to 
> > reduce an example.
> 
> Not really, which is why it's taking so long. Our infrastructure in that space
> is lacking. The issue is that the root cause is not in one compilation step but
> rather in some of the dependencies, and the dependency graph for any of those
> problems is really large.
> 
> We will get you a reproducer, please bear with us. Sorry for taking a long 
> time.

Back in the day, when we were more actively developing modules, we wrote this
tool: https://github.com/Teemperor/hippie (it hooks into the system calls and
copies all files involved in a crash/miscompilation under a common sysroot,
preserving the file structure). Maybe that'd help...

https://github.com/llvm/llvm-project/pull/83237


[llvm-branch-commits] [mlir] [mlir][Interfaces][WIP] `Variable` abstraction for `ValueBoundsOpInterface` (PR #87980)

2024-04-15 Thread Matthias Springer via llvm-branch-commits

https://github.com/matthias-springer edited 
https://github.com/llvm/llvm-project/pull/87980


[llvm-branch-commits] [mlir] [mlir][Interfaces] `Variable` abstraction for `ValueBoundsOpInterface` (PR #87980)

2024-04-15 Thread Matthias Springer via llvm-branch-commits

https://github.com/matthias-springer edited 
https://github.com/llvm/llvm-project/pull/87980


[llvm-branch-commits] [flang] [Flang][OpenMP][Lower] Refactor lowering of compound constructs (PR #87070)

2024-04-15 Thread Krzysztof Parzyszek via llvm-branch-commits

https://github.com/kparzysz approved this pull request.


https://github.com/llvm/llvm-project/pull/87070


[llvm-branch-commits] [BOLT] Add local symbol hint to split fragment name (PR #88627)

2024-04-15 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov closed https://github.com/llvm/llvm-project/pull/88627


[llvm-branch-commits] [BOLT] Use BAT to register fragments (PR #87968)

2024-04-15 Thread Amir Ayupov via llvm-branch-commits

https://github.com/aaupov closed https://github.com/llvm/llvm-project/pull/87968


[llvm-branch-commits] [lld] release/18.x: [LLD] [COFF] Don't add pseudo relocs for dangling references (#88487) (PR #88759)

2024-04-15 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/88759

Backport 9c970d5ecd6a85188cd2b0a941fcd4d60063ef81

Requested by: @mstorsjo

From e7e348ade66017e29e6982e508ec3f9088a1044b Mon Sep 17 00:00:00 2001
From: Martin Storsjö
Date: Mon, 15 Apr 2024 20:14:07 +0300
Subject: [PATCH] [LLD] [COFF] Don't add pseudo relocs for dangling references
 (#88487)

When doing GC, we normally won't have dangling references, because such
a reference would keep the other section alive, keeping it from being
eliminated.

However, references within DWARF sections are ignored for the purposes
of GC (because otherwise, they would essentially keep everything alive,
defeating the point of the GC), see
c579a5b1d92a9bc2046d00ee2d427832e0f5ddec for more context.

Therefore, dangling relocations against discarded symbols are ignored
within DWARF sections (see maybeReportRelocationToDiscarded in
Chunks.cpp). Consequently, we also shouldn't create any pseudo
relocations for these cases, as we run into a null pointer dereference
when trying to generate the pseudo relocation info for it.

This fixes the downstream bug
https://github.com/mstorsjo/llvm-mingw/issues/418, fixing crashes on
combinations with -ffunction-sections, -fdata-sections,
-Wl,--gc-sections and debug info.
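
As a rough illustration of the guard described above, here is a minimal, self-contained C++ sketch. `Chunk`, `Symbol`, `Reloc`, and `collectPseudoRelocs` are hypothetical stand-ins and not LLD's actual types; the real check lives in `SectionChunk::getRuntimePseudoRelocs`, shown in the diff.

```cpp
#include <vector>

// Hypothetical stand-ins for illustration; these are not LLD's real classes.
struct Chunk {};

struct Symbol {
  bool isRuntimePseudoReloc = false;
  Chunk *chunk = nullptr; // null once GC has discarded the backing chunk
};

struct Reloc {
  Symbol *target = nullptr;
};

// Collect the symbols that need a runtime pseudo relocation, skipping
// dangling references whose target chunk was garbage collected.
std::vector<Symbol *> collectPseudoRelocs(const std::vector<Reloc> &relocs) {
  std::vector<Symbol *> result;
  for (const Reloc &rel : relocs) {
    Symbol *target = rel.target;
    if (!target || !target->isRuntimePseudoReloc)
      continue;
    if (!target->chunk) // dangling reference, e.g. from a DWARF section
      continue;
    result.push_back(target);
  }
  return result;
}
```

Without the second `continue`, code that later reads data through the null chunk pointer would hit the kind of null dereference the commit message describes.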

(cherry picked from commit 9c970d5ecd6a85188cd2b0a941fcd4d60063ef81)
---
 lld/COFF/Chunks.cpp   |  7 ++
 lld/test/COFF/autoimport-gc.s | 41 +++
 2 files changed, 48 insertions(+)
 create mode 100644 lld/test/COFF/autoimport-gc.s

diff --git a/lld/COFF/Chunks.cpp b/lld/COFF/Chunks.cpp
index 39f4575031be54..e2074932bc466e 100644
--- a/lld/COFF/Chunks.cpp
+++ b/lld/COFF/Chunks.cpp
@@ -652,6 +652,13 @@ void SectionChunk::getRuntimePseudoRelocs(
 dyn_cast_or_null(file->getSymbol(rel.SymbolTableIndex));
 if (!target || !target->isRuntimePseudoReloc)
   continue;
+// If the target doesn't have a chunk allocated, it may be a
+// DefinedImportData symbol which ended up unnecessary after GC.
+// Normally we wouldn't eliminate section chunks that are referenced, but
+// references within DWARF sections don't count for keeping section chunks
+// alive. Thus such dangling references in DWARF sections are expected.
+if (!target->getChunk())
+  continue;
 int sizeInBits =
 getRuntimePseudoRelocSize(rel.Type, file->ctx.config.machine);
 if (sizeInBits == 0) {
diff --git a/lld/test/COFF/autoimport-gc.s b/lld/test/COFF/autoimport-gc.s
new file mode 100644
index 00..fef6c02eba82f9
--- /dev/null
+++ b/lld/test/COFF/autoimport-gc.s
@@ -0,0 +1,41 @@
+# REQUIRES: x86
+# RUN: split-file %s %t.dir
+
+# RUN: llvm-mc -triple=x86_64-windows-gnu %t.dir/lib.s -filetype=obj -o %t.dir/lib.obj
+# RUN: lld-link -out:%t.dir/lib.dll -dll -entry:DllMainCRTStartup %t.dir/lib.obj -lldmingw -implib:%t.dir/lib.lib
+
+# RUN: llvm-mc -triple=x86_64-windows-gnu %t.dir/main.s -filetype=obj -o %t.dir/main.obj
+# RUN: lld-link -lldmingw -out:%t.dir/main.exe -entry:main %t.dir/main.obj %t.dir/lib.lib -opt:ref -debug:dwarf
+
+#--- main.s
+.global main
+.section .text$main,"xr",one_only,main
+main:
+ret
+
+.global other
+.section .text$other,"xr",one_only,other
+other:
+movq .refptr.variable(%rip), %rax
+movl (%rax), %eax
+ret
+
+.section .rdata$.refptr.variable,"dr",discard,.refptr.variable
+.global .refptr.variable
+.refptr.variable:
+.quad   variable
+
+.section .debug_info
+.long 1
+.quad variable
+.long 2
+
+#--- lib.s
+.global variable
+.global DllMainCRTStartup
+.text
+DllMainCRTStartup:
+ret
+.data
+variable:
+.long 42



[llvm-branch-commits] [lld] release/18.x: [LLD] [COFF] Don't add pseudo relocs for dangling references (#88487) (PR #88759)

2024-04-15 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/88759


[llvm-branch-commits] [lld] release/18.x: [LLD] [COFF] Don't add pseudo relocs for dangling references (#88487) (PR #88759)

2024-04-15 Thread via llvm-branch-commits

llvmbot wrote:

@cjacek What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/88759


[llvm-branch-commits] [lld] release/18.x: [LLD] [COFF] Don't add pseudo relocs for dangling references (#88487) (PR #88759)

2024-04-15 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-lld

Author: None (llvmbot)


Changes

Backport 9c970d5ecd6a85188cd2b0a941fcd4d60063ef81

Requested by: @mstorsjo

---
Full diff: https://github.com/llvm/llvm-project/pull/88759.diff


2 Files Affected:

- (modified) lld/COFF/Chunks.cpp (+7) 
- (added) lld/test/COFF/autoimport-gc.s (+41) 


```diff
diff --git a/lld/COFF/Chunks.cpp b/lld/COFF/Chunks.cpp
index 39f4575031be54..e2074932bc466e 100644
--- a/lld/COFF/Chunks.cpp
+++ b/lld/COFF/Chunks.cpp
@@ -652,6 +652,13 @@ void SectionChunk::getRuntimePseudoRelocs(
 dyn_cast_or_null(file->getSymbol(rel.SymbolTableIndex));
 if (!target || !target->isRuntimePseudoReloc)
   continue;
+// If the target doesn't have a chunk allocated, it may be a
+// DefinedImportData symbol which ended up unnecessary after GC.
+// Normally we wouldn't eliminate section chunks that are referenced, but
+// references within DWARF sections don't count for keeping section chunks
+// alive. Thus such dangling references in DWARF sections are expected.
+if (!target->getChunk())
+  continue;
 int sizeInBits =
 getRuntimePseudoRelocSize(rel.Type, file->ctx.config.machine);
 if (sizeInBits == 0) {
diff --git a/lld/test/COFF/autoimport-gc.s b/lld/test/COFF/autoimport-gc.s
new file mode 100644
index 00..fef6c02eba82f9
--- /dev/null
+++ b/lld/test/COFF/autoimport-gc.s
@@ -0,0 +1,41 @@
+# REQUIRES: x86
+# RUN: split-file %s %t.dir
+
+# RUN: llvm-mc -triple=x86_64-windows-gnu %t.dir/lib.s -filetype=obj -o %t.dir/lib.obj
+# RUN: lld-link -out:%t.dir/lib.dll -dll -entry:DllMainCRTStartup %t.dir/lib.obj -lldmingw -implib:%t.dir/lib.lib
+
+# RUN: llvm-mc -triple=x86_64-windows-gnu %t.dir/main.s -filetype=obj -o %t.dir/main.obj
+# RUN: lld-link -lldmingw -out:%t.dir/main.exe -entry:main %t.dir/main.obj %t.dir/lib.lib -opt:ref -debug:dwarf
+
+#--- main.s
+.global main
+.section .text$main,"xr",one_only,main
+main:
+ret
+
+.global other
+.section .text$other,"xr",one_only,other
+other:
+movq .refptr.variable(%rip), %rax
+movl (%rax), %eax
+ret
+
+.section .rdata$.refptr.variable,"dr",discard,.refptr.variable
+.global .refptr.variable
+.refptr.variable:
+.quad   variable
+
+.section .debug_info
+.long 1
+.quad variable
+.long 2
+
+#--- lib.s
+.global variable
+.global DllMainCRTStartup
+.text
+DllMainCRTStartup:
+ret
+.data
+variable:
+.long 42

```




https://github.com/llvm/llvm-project/pull/88759


[llvm-branch-commits] [clang] 82b9a06 - Revert "[clang analysis] ExprMutationAnalyzer avoid infinite recursion for re…"

2024-04-15 Thread via llvm-branch-commits

Author: Florian Mayer
Date: 2024-04-15T10:46:21-07:00
New Revision: 82b9a06f73df5301ffd950775055304124f63e02

URL: 
https://github.com/llvm/llvm-project/commit/82b9a06f73df5301ffd950775055304124f63e02
DIFF: 
https://github.com/llvm/llvm-project/commit/82b9a06f73df5301ffd950775055304124f63e02.diff

LOG: Revert "[clang analysis] ExprMutationAnalyzer avoid infinite recursion for re…"

This reverts commit 8095b9ce6bf5831a14c72028920708f38d13d0c3.

Added: 


Modified: 
clang-tools-extra/docs/ReleaseNotes.rst
clang-tools-extra/test/clang-tidy/checkers/misc/const-correctness-templates.cpp
clang/include/clang/Analysis/Analyses/ExprMutationAnalyzer.h
clang/lib/Analysis/ExprMutationAnalyzer.cpp
clang/unittests/Analysis/ExprMutationAnalyzerTest.cpp

Removed: 




diff --git a/clang-tools-extra/docs/ReleaseNotes.rst b/clang-tools-extra/docs/ReleaseNotes.rst
index 7095c56fe6..4dfbd8ca49ab9b 100644
--- a/clang-tools-extra/docs/ReleaseNotes.rst
+++ b/clang-tools-extra/docs/ReleaseNotes.rst
@@ -221,10 +221,6 @@ Changes in existing checks
   ` check by replacing the local
   option `HeaderFileExtensions` by the global option of the same name.
 
-- Improved :doc:`misc-const-correctness
-  ` check by avoiding infinite recursion
-  for recursive forwarding reference.
-
 - Improved :doc:`misc-definitions-in-headers
   ` check by replacing the local
   option `HeaderFileExtensions` by the global option of the same name.

diff --git a/clang-tools-extra/test/clang-tidy/checkers/misc/const-correctness-templates.cpp b/clang-tools-extra/test/clang-tidy/checkers/misc/const-correctness-templates.cpp
index 248374a71dd40b..9da468128743e9 100644
--- a/clang-tools-extra/test/clang-tidy/checkers/misc/const-correctness-templates.cpp
+++ b/clang-tools-extra/test/clang-tidy/checkers/misc/const-correctness-templates.cpp
@@ -58,18 +58,3 @@ void concatenate3(Args... args)
 (..., (stream << args));
 }
 } // namespace gh70323
-
-namespace gh60895 {
-
-template  void f1(T &&a);
-template  void f2(T &&a);
-template  void f1(T &&a) { f2(a); }
-template  void f2(T &&a) { f1(a); }
-void f() {
-  int x = 0;
-  // CHECK-MESSAGES:[[@LINE-1]]:3: warning: variable 'x' of type 'int' can be declared 'const'
-  // CHECK-FIXES: int const x = 0;
-  f1(x);
-}
-
-} // namespace gh60895

diff --git a/clang/include/clang/Analysis/Analyses/ExprMutationAnalyzer.h b/clang/include/clang/Analysis/Analyses/ExprMutationAnalyzer.h
index c4e5d0badb8e58..1ceef944fbc34e 100644
--- a/clang/include/clang/Analysis/Analyses/ExprMutationAnalyzer.h
+++ b/clang/include/clang/Analysis/Analyses/ExprMutationAnalyzer.h
@@ -8,10 +8,11 @@
 #ifndef LLVM_CLANG_ANALYSIS_ANALYSES_EXPRMUTATIONANALYZER_H
 #define LLVM_CLANG_ANALYSIS_ANALYSES_EXPRMUTATIONANALYZER_H
 
+#include 
+
 #include "clang/AST/AST.h"
 #include "clang/ASTMatchers/ASTMatchers.h"
 #include "llvm/ADT/DenseMap.h"
-#include 
 
 namespace clang {
 
@@ -21,15 +22,8 @@ class FunctionParmMutationAnalyzer;
 /// a given statement.
 class ExprMutationAnalyzer {
 public:
-  friend class FunctionParmMutationAnalyzer;
-  struct Cache {
-llvm::SmallDenseMap>
-FuncParmAnalyzer;
-  };
-
   ExprMutationAnalyzer(const Stmt &Stm, ASTContext &Context)
-  : ExprMutationAnalyzer(Stm, Context, std::make_shared()) {}
+  : Stm(Stm), Context(Context) {}
 
   bool isMutated(const Expr *Exp) { return findMutation(Exp) != nullptr; }
   bool isMutated(const Decl *Dec) { return findMutation(Dec) != nullptr; }
@@ -51,11 +45,6 @@ class ExprMutationAnalyzer {
   using MutationFinder = const Stmt *(ExprMutationAnalyzer::*)(const Expr *);
   using ResultMap = llvm::DenseMap;
 
-  ExprMutationAnalyzer(const Stmt &Stm, ASTContext &Context,
-   std::shared_ptr CrossAnalysisCache)
-  : Stm(Stm), Context(Context),
-CrossAnalysisCache(std::move(CrossAnalysisCache)) {}
-
   const Stmt *findMutationMemoized(const Expr *Exp,
llvm::ArrayRef Finders,
ResultMap &MemoizedResults);
@@ -80,7 +69,9 @@ class ExprMutationAnalyzer {
 
   const Stmt &Stm;
   ASTContext &Context;
-  std::shared_ptr CrossAnalysisCache;
+  llvm::DenseMap>
+  FuncParmAnalyzer;
   ResultMap Results;
   ResultMap PointeeResults;
 };
@@ -89,12 +80,7 @@ class ExprMutationAnalyzer {
 // params.
 class FunctionParmMutationAnalyzer {
 public:
-  FunctionParmMutationAnalyzer(const FunctionDecl &Func, ASTContext &Context)
-  : FunctionParmMutationAnalyzer(
-Func, Context, std::make_shared()) {}
-  FunctionParmMutationAnalyzer(
-  const FunctionDecl &Func, ASTContext &Context,
-  std::shared_ptr CrossAnalysisCache);
+  FunctionParmMutationAnalyzer(const FunctionDecl &Func, ASTContext &Context);
 
   bool isMutated(const ParmVarDecl *Parm) {
 return findMutation(Parm) != nullptr;

diff  --git a/clang/lib/Analysis/ExprMutationAnaly

[llvm-branch-commits] [lld] release/18.x: [LLD] [COFF] Don't add pseudo relocs for dangling references (#88487) (PR #88759)

2024-04-15 Thread Jacek Caban via llvm-branch-commits

https://github.com/cjacek approved this pull request.


https://github.com/llvm/llvm-project/pull/88759


[llvm-branch-commits] [mlir] [MLIR][XeGPU] Add dpas and named barrier ops (PR #88439)

2024-04-15 Thread Chao Chen via llvm-branch-commits

https://github.com/chencha3 updated 
https://github.com/llvm/llvm-project/pull/88439

From 6021411059863c9a2bfdfc91e35628328e709a8c Mon Sep 17 00:00:00 2001
From: Chao Chen 
Date: Thu, 11 Apr 2024 15:46:26 -0500
Subject: [PATCH 1/2] Add dpas and named barrier ops

---
 .../mlir/Dialect/XeGPU/IR/CMakeLists.txt  |   6 +-
 mlir/include/mlir/Dialect/XeGPU/IR/XeGPU.h|   3 +-
 .../mlir/Dialect/XeGPU/IR/XeGPUAttrs.td   |   1 +
 .../mlir/Dialect/XeGPU/IR/XeGPUDialect.td |   4 +-
 .../include/mlir/Dialect/XeGPU/IR/XeGPUOps.td | 154 +-
 .../mlir/Dialect/XeGPU/IR/XeGPUTypes.td   |  11 ++
 mlir/lib/Dialect/XeGPU/IR/XeGPUOps.cpp|  23 +++
 mlir/test/Dialect/XeGPU/XeGPUOps.mlir |  57 ++-
 8 files changed, 250 insertions(+), 9 deletions(-)

diff --git a/mlir/include/mlir/Dialect/XeGPU/IR/CMakeLists.txt b/mlir/include/mlir/Dialect/XeGPU/IR/CMakeLists.txt
index f1740e9ed929a6..3f8cac4dc07c3c 100644
--- a/mlir/include/mlir/Dialect/XeGPU/IR/CMakeLists.txt
+++ b/mlir/include/mlir/Dialect/XeGPU/IR/CMakeLists.txt
@@ -2,12 +2,12 @@ add_mlir_dialect(XeGPU xegpu)
 add_mlir_doc(XeGPU XeGPU Dialects/ -gen-dialect-doc -dialect=xegpu)
 
 set(LLVM_TARGET_DEFINITIONS XeGPU.td)
-mlir_tablegen(XeGPUAttrs.h.inc -gen-attrdef-decls)
-mlir_tablegen(XeGPUAttrs.cpp.inc -gen-attrdef-defs)
+mlir_tablegen(XeGPUAttrs.h.inc -gen-attrdef-decls -attrdefs-dialect=xegpu)
+mlir_tablegen(XeGPUAttrs.cpp.inc -gen-attrdef-defs -attrdefs-dialect=xegpu)
 add_public_tablegen_target(MLIRXeGPUAttrsIncGen)
 add_dependencies(mlir-headers MLIRXeGPUAttrsIncGen)
 
-set(LLVM_TARGET_DEFINITIONS XeGPU.td)
+set(LLVM_TARGET_DEFINITIONS XeGPUAttrs.td)
 mlir_tablegen(XeGPUEnums.h.inc -gen-enum-decls)
 mlir_tablegen(XeGPUEnums.cpp.inc -gen-enum-defs)
 add_public_tablegen_target(MLIRXeGPUEnumsIncGen)
diff --git a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPU.h b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPU.h
index eca9255ff3974b..7ac0cf77fe59bb 100644
--- a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPU.h
+++ b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPU.h
@@ -10,6 +10,7 @@
 #define MLIR_DIALECT_XEGPU_IR_XEGPU_H
 
 #include "mlir/Bytecode/BytecodeOpInterface.h"
+#include "mlir/Dialect/Arith/IR/Arith.h"
 #include "mlir/IR/BuiltinTypes.h"
 #include "mlir/IR/Dialect.h"
 #include "mlir/IR/TypeUtilities.h"
@@ -19,7 +20,7 @@
 
 namespace mlir {
 namespace xegpu {
-// placeholder
+class TensorDescType;
 } // namespace xegpu
 } // namespace mlir
 
diff --git a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td
index 6579d07ec26215..c14cba4990a738 100644
--- a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td
+++ b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td
@@ -10,6 +10,7 @@
 #define MLIR_DIALECT_XEGPU_IR_XEGPUATTRS_TD
 
 include "mlir/Dialect/XeGPU/IR/XeGPUDialect.td"
+include "mlir/IR/AttrTypeBase.td"
 include "mlir/IR/EnumAttr.td"
 
 class XeGPUAttr traits = [],
diff --git a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUDialect.td b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUDialect.td
index c2f09319c790e0..765f218f95d269 100644
--- a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUDialect.td
+++ b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUDialect.td
@@ -17,12 +17,14 @@ def XeGPU_Dialect : Dialect {
 let summary = "The XeGPU dialect that models Intel GPU's ISA";
 let description = [{
   The XeGPU dialect models Intel Xe ISA semantics but works at vector and
-  TensorDesc data type. It provides 1:1 mappings to match Xe instructions 
+  TensorDesc data type. It provides 1:1 mappings to match Xe instructions
   like DPAS and 2D block load. The matrix size being processed at this level
   exactly matches the hardware instructions or the intrinsic supported by
   the lower-level GPU compiler.
 }];
 
+let dependentDialects = ["arith::ArithDialect"];
+
 let useDefaultTypePrinterParser = true;
 let useDefaultAttributePrinterParser = true;
 }
diff --git a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUOps.td b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUOps.td
index a031a75984a536..3423609b76c706 100644
--- a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUOps.td
+++ b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUOps.td
@@ -9,7 +9,7 @@
 #ifndef MLIR_DIALECT_XEGPU_IR_XEGPUOPS_TD
 #define MLIR_DIALECT_XEGPU_IR_XEGPUOPS_TD
 
-include "mlir/IR/AttrTypeBase.td"
+include "mlir/Dialect/Arith/IR/ArithBase.td"
 include "mlir/Dialect/XeGPU/IR/XeGPUAttrs.td"
 include "mlir/Dialect/XeGPU/IR/XeGPUDialect.td"
 include "mlir/Dialect/XeGPU/IR/XeGPUTypes.td"
@@ -35,7 +35,7 @@ class XeGPU_Op traits = []>:
 
 static ::mlir::ParseResult parseProperties(::mlir::OpAsmParser &parser,
  ::mlir::OperationState &result) {
-  if (mlir::succeeded(parser.parseLess())) {
+  if (mlir::succeeded(parser.parseOptionalLess())) {
 if (parser.parseAttribute(result.propertiesAttr) || parser.parseGreater())
   return failure();
   }
@@ -253,7 +253,7 @@ def XeG

[llvm-branch-commits] [mlir] [MLIR][XeGPU] Add dpas and named barrier ops (PR #88439)

2024-04-15 Thread Chao Chen via llvm-branch-commits

https://github.com/chencha3 updated 
https://github.com/llvm/llvm-project/pull/88439

From 6021411059863c9a2bfdfc91e35628328e709a8c Mon Sep 17 00:00:00 2001
From: Chao Chen 
Date: Thu, 11 Apr 2024 15:46:26 -0500
Subject: [PATCH 1/3] Add dpas and named barrier ops

---
 .../mlir/Dialect/XeGPU/IR/CMakeLists.txt  |   6 +-
 mlir/include/mlir/Dialect/XeGPU/IR/XeGPU.h|   3 +-
 .../mlir/Dialect/XeGPU/IR/XeGPUAttrs.td   |   1 +
 .../mlir/Dialect/XeGPU/IR/XeGPUDialect.td |   4 +-
 .../include/mlir/Dialect/XeGPU/IR/XeGPUOps.td | 154 +-
 .../mlir/Dialect/XeGPU/IR/XeGPUTypes.td   |  11 ++
 mlir/lib/Dialect/XeGPU/IR/XeGPUOps.cpp|  23 +++
 mlir/test/Dialect/XeGPU/XeGPUOps.mlir |  57 ++-
 8 files changed, 250 insertions(+), 9 deletions(-)

diff --git a/mlir/include/mlir/Dialect/XeGPU/IR/CMakeLists.txt b/mlir/include/mlir/Dialect/XeGPU/IR/CMakeLists.txt
index f1740e9ed929a6..3f8cac4dc07c3c 100644
--- a/mlir/include/mlir/Dialect/XeGPU/IR/CMakeLists.txt
+++ b/mlir/include/mlir/Dialect/XeGPU/IR/CMakeLists.txt
@@ -2,12 +2,12 @@ add_mlir_dialect(XeGPU xegpu)
 add_mlir_doc(XeGPU XeGPU Dialects/ -gen-dialect-doc -dialect=xegpu)
 
 set(LLVM_TARGET_DEFINITIONS XeGPU.td)
-mlir_tablegen(XeGPUAttrs.h.inc -gen-attrdef-decls)
-mlir_tablegen(XeGPUAttrs.cpp.inc -gen-attrdef-defs)
+mlir_tablegen(XeGPUAttrs.h.inc -gen-attrdef-decls -attrdefs-dialect=xegpu)
+mlir_tablegen(XeGPUAttrs.cpp.inc -gen-attrdef-defs -attrdefs-dialect=xegpu)
 add_public_tablegen_target(MLIRXeGPUAttrsIncGen)
 add_dependencies(mlir-headers MLIRXeGPUAttrsIncGen)
 
-set(LLVM_TARGET_DEFINITIONS XeGPU.td)
+set(LLVM_TARGET_DEFINITIONS XeGPUAttrs.td)
 mlir_tablegen(XeGPUEnums.h.inc -gen-enum-decls)
 mlir_tablegen(XeGPUEnums.cpp.inc -gen-enum-defs)
 add_public_tablegen_target(MLIRXeGPUEnumsIncGen)
diff --git a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPU.h b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPU.h
index eca9255ff3974b..7ac0cf77fe59bb 100644
--- a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPU.h
+++ b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPU.h
@@ -10,6 +10,7 @@
 #define MLIR_DIALECT_XEGPU_IR_XEGPU_H
 
 #include "mlir/Bytecode/BytecodeOpInterface.h"
+#include "mlir/Dialect/Arith/IR/Arith.h"
 #include "mlir/IR/BuiltinTypes.h"
 #include "mlir/IR/Dialect.h"
 #include "mlir/IR/TypeUtilities.h"
@@ -19,7 +20,7 @@
 
 namespace mlir {
 namespace xegpu {
-// placeholder
+class TensorDescType;
 } // namespace xegpu
 } // namespace mlir
 
diff --git a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td
index 6579d07ec26215..c14cba4990a738 100644
--- a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td
+++ b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td
@@ -10,6 +10,7 @@
 #define MLIR_DIALECT_XEGPU_IR_XEGPUATTRS_TD
 
 include "mlir/Dialect/XeGPU/IR/XeGPUDialect.td"
+include "mlir/IR/AttrTypeBase.td"
 include "mlir/IR/EnumAttr.td"
 
 class XeGPUAttr traits = [],
diff --git a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUDialect.td b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUDialect.td
index c2f09319c790e0..765f218f95d269 100644
--- a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUDialect.td
+++ b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUDialect.td
@@ -17,12 +17,14 @@ def XeGPU_Dialect : Dialect {
 let summary = "The XeGPU dialect that models Intel GPU's ISA";
 let description = [{
   The XeGPU dialect models Intel Xe ISA semantics but works at vector and
-  TensorDesc data type. It provides 1:1 mappings to match Xe instructions 
+  TensorDesc data type. It provides 1:1 mappings to match Xe instructions
   like DPAS and 2D block load. The matrix size being processed at this level
   exactly matches the hardware instructions or the intrinsic supported by
   the lower-level GPU compiler.
 }];
 
+let dependentDialects = ["arith::ArithDialect"];
+
 let useDefaultTypePrinterParser = true;
 let useDefaultAttributePrinterParser = true;
 }
diff --git a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUOps.td b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUOps.td
index a031a75984a536..3423609b76c706 100644
--- a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUOps.td
+++ b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUOps.td
@@ -9,7 +9,7 @@
 #ifndef MLIR_DIALECT_XEGPU_IR_XEGPUOPS_TD
 #define MLIR_DIALECT_XEGPU_IR_XEGPUOPS_TD
 
-include "mlir/IR/AttrTypeBase.td"
+include "mlir/Dialect/Arith/IR/ArithBase.td"
 include "mlir/Dialect/XeGPU/IR/XeGPUAttrs.td"
 include "mlir/Dialect/XeGPU/IR/XeGPUDialect.td"
 include "mlir/Dialect/XeGPU/IR/XeGPUTypes.td"
@@ -35,7 +35,7 @@ class XeGPU_Op traits = []>:
 
 static ::mlir::ParseResult parseProperties(::mlir::OpAsmParser &parser,
  ::mlir::OperationState &result) {
-  if (mlir::succeeded(parser.parseLess())) {
+  if (mlir::succeeded(parser.parseOptionalLess())) {
 if (parser.parseAttribute(result.propertiesAttr) || parser.parseGreater())
   return failure();
   }
@@ -253,7 +253,7 @@ def XeG
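
The parseLess() to parseOptionalLess() change in the hunk above makes the leading `<...>` property list optional: a required-token parse reports an error when the token is missing, while an optional-token parse simply reports that nothing was consumed. A minimal sketch of that distinction, assuming a toy character parser (ToyParser, parseChar, parseOptionalChar and parseOptionalProps are hypothetical names for illustration, not the MLIR OpAsmParser API):

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// Toy stand-in for a token parser; NOT the MLIR OpAsmParser API.
struct ToyParser {
  std::string_view input;
  std::string error; // set when a required token is missing

  // Optional token: consumes the character if present, otherwise leaves
  // the input untouched and records no error.
  bool parseOptionalChar(char c) {
    if (!input.empty() && input.front() == c) {
      input.remove_prefix(1);
      return true;
    }
    return false;
  }

  // Required token: records a diagnostic when the character is absent.
  bool parseChar(char c) {
    if (parseOptionalChar(c))
      return true;
    error = std::string("expected '") + c + "'";
    return false;
  }
};

// Parses an optional "<props>" prefix. Returns std::nullopt for a malformed
// list and an empty string when the list is absent.
std::optional<std::string> parseOptionalProps(ToyParser &p) {
  std::string props;
  if (p.parseOptionalChar('<')) {
    while (!p.input.empty() && p.input.front() != '>') {
      props += p.input.front();
      p.input.remove_prefix(1);
    }
    if (!p.parseChar('>'))
      return std::nullopt;
  }
  return props;
}
```

With the required variant in the outer `if`, input that has no `<` at all would already have produced a diagnostic before the parser could fall through to the rest of the operation.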

[llvm-branch-commits] [mlir] [MLIR][XeGPU] Add dpas and named barrier ops (PR #88439)

2024-04-15 Thread Chao Chen via llvm-branch-commits

https://github.com/chencha3 updated 
https://github.com/llvm/llvm-project/pull/88439

>From 6021411059863c9a2bfdfc91e35628328e709a8c Mon Sep 17 00:00:00 2001
From: Chao Chen 
Date: Thu, 11 Apr 2024 15:46:26 -0500
Subject: [PATCH 1/4] Add dpas and named barrier ops

---
 .../mlir/Dialect/XeGPU/IR/CMakeLists.txt  |   6 +-
 mlir/include/mlir/Dialect/XeGPU/IR/XeGPU.h|   3 +-
 .../mlir/Dialect/XeGPU/IR/XeGPUAttrs.td   |   1 +
 .../mlir/Dialect/XeGPU/IR/XeGPUDialect.td |   4 +-
 .../include/mlir/Dialect/XeGPU/IR/XeGPUOps.td | 154 +-
 .../mlir/Dialect/XeGPU/IR/XeGPUTypes.td   |  11 ++
 mlir/lib/Dialect/XeGPU/IR/XeGPUOps.cpp|  23 +++
 mlir/test/Dialect/XeGPU/XeGPUOps.mlir |  57 ++-
 8 files changed, 250 insertions(+), 9 deletions(-)

diff --git a/mlir/include/mlir/Dialect/XeGPU/IR/CMakeLists.txt b/mlir/include/mlir/Dialect/XeGPU/IR/CMakeLists.txt
index f1740e9ed929a6..3f8cac4dc07c3c 100644
--- a/mlir/include/mlir/Dialect/XeGPU/IR/CMakeLists.txt
+++ b/mlir/include/mlir/Dialect/XeGPU/IR/CMakeLists.txt
@@ -2,12 +2,12 @@ add_mlir_dialect(XeGPU xegpu)
 add_mlir_doc(XeGPU XeGPU Dialects/ -gen-dialect-doc -dialect=xegpu)
 
 set(LLVM_TARGET_DEFINITIONS XeGPU.td)
-mlir_tablegen(XeGPUAttrs.h.inc -gen-attrdef-decls)
-mlir_tablegen(XeGPUAttrs.cpp.inc -gen-attrdef-defs)
+mlir_tablegen(XeGPUAttrs.h.inc -gen-attrdef-decls -attrdefs-dialect=xegpu)
+mlir_tablegen(XeGPUAttrs.cpp.inc -gen-attrdef-defs -attrdefs-dialect=xegpu)
 add_public_tablegen_target(MLIRXeGPUAttrsIncGen)
 add_dependencies(mlir-headers MLIRXeGPUAttrsIncGen)
 
-set(LLVM_TARGET_DEFINITIONS XeGPU.td)
+set(LLVM_TARGET_DEFINITIONS XeGPUAttrs.td)
 mlir_tablegen(XeGPUEnums.h.inc -gen-enum-decls)
 mlir_tablegen(XeGPUEnums.cpp.inc -gen-enum-defs)
 add_public_tablegen_target(MLIRXeGPUEnumsIncGen)
diff --git a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPU.h b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPU.h
index eca9255ff3974b..7ac0cf77fe59bb 100644
--- a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPU.h
+++ b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPU.h
@@ -10,6 +10,7 @@
 #define MLIR_DIALECT_XEGPU_IR_XEGPU_H
 
 #include "mlir/Bytecode/BytecodeOpInterface.h"
+#include "mlir/Dialect/Arith/IR/Arith.h"
 #include "mlir/IR/BuiltinTypes.h"
 #include "mlir/IR/Dialect.h"
 #include "mlir/IR/TypeUtilities.h"
@@ -19,7 +20,7 @@
 
 namespace mlir {
 namespace xegpu {
-// placeholder
+class TensorDescType;
 } // namespace xegpu
 } // namespace mlir
 
diff --git a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td
index 6579d07ec26215..c14cba4990a738 100644
--- a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td
+++ b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td
@@ -10,6 +10,7 @@
 #define MLIR_DIALECT_XEGPU_IR_XEGPUATTRS_TD
 
 include "mlir/Dialect/XeGPU/IR/XeGPUDialect.td"
+include "mlir/IR/AttrTypeBase.td"
 include "mlir/IR/EnumAttr.td"
 
 class XeGPUAttr traits = [],
diff --git a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUDialect.td b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUDialect.td
index c2f09319c790e0..765f218f95d269 100644
--- a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUDialect.td
+++ b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUDialect.td
@@ -17,12 +17,14 @@ def XeGPU_Dialect : Dialect {
 let summary = "The XeGPU dialect that models Intel GPU's ISA";
 let description = [{
   The XeGPU dialect models Intel Xe ISA semantics but works at vector and
-  TensorDesc data type. It provides 1:1 mappings to match Xe instructions 
+  TensorDesc data type. It provides 1:1 mappings to match Xe instructions
   like DPAS and 2D block load. The matrix size being processed at this level
   exactly matches the hardware instructions or the intrinsic supported by
   the lower-level GPU compiler.
 }];
 
+let dependentDialects = ["arith::ArithDialect"];
+
 let useDefaultTypePrinterParser = true;
 let useDefaultAttributePrinterParser = true;
 }
diff --git a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUOps.td b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUOps.td
index a031a75984a536..3423609b76c706 100644
--- a/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUOps.td
+++ b/mlir/include/mlir/Dialect/XeGPU/IR/XeGPUOps.td
@@ -9,7 +9,7 @@
 #ifndef MLIR_DIALECT_XEGPU_IR_XEGPUOPS_TD
 #define MLIR_DIALECT_XEGPU_IR_XEGPUOPS_TD
 
-include "mlir/IR/AttrTypeBase.td"
+include "mlir/Dialect/Arith/IR/ArithBase.td"
 include "mlir/Dialect/XeGPU/IR/XeGPUAttrs.td"
 include "mlir/Dialect/XeGPU/IR/XeGPUDialect.td"
 include "mlir/Dialect/XeGPU/IR/XeGPUTypes.td"
@@ -35,7 +35,7 @@ class XeGPU_Op traits = []>:
 
 static ::mlir::ParseResult parseProperties(::mlir::OpAsmParser &parser,
  ::mlir::OperationState &result) {
-  if (mlir::succeeded(parser.parseLess())) {
+  if (mlir::succeeded(parser.parseOptionalLess())) {
 if (parser.parseAttribute(result.propertiesAttr) || parser.parseGreater())
   return failure();
   }
@@ -253,7 +253,7 @@ def XeG

[llvm-branch-commits] [mlir] [MLIR][XeGPU] Add dpas and named barrier ops (PR #88439)

2024-04-15 Thread Chao Chen via llvm-branch-commits

https://github.com/chencha3 updated 
https://github.com/llvm/llvm-project/pull/88439

>From 6021411059863c9a2bfdfc91e35628328e709a8c Mon Sep 17 00:00:00 2001
From: Chao Chen 
Date: Thu, 11 Apr 2024 15:46:26 -0500
Subject: [PATCH 1/5] Add dpas and named barrier ops


[llvm-branch-commits] [mlir] [MLIR][XeGPU] Add dpas and named barrier ops (PR #88439)

2024-04-15 Thread Chao Chen via llvm-branch-commits

https://github.com/chencha3 updated 
https://github.com/llvm/llvm-project/pull/88439

>From 6021411059863c9a2bfdfc91e35628328e709a8c Mon Sep 17 00:00:00 2001
From: Chao Chen 
Date: Thu, 11 Apr 2024 15:46:26 -0500
Subject: [PATCH 1/6] Add dpas and named barrier ops


[llvm-branch-commits] [llvm] 5415528 - Revert "[CodeGen] Update for scalable MemoryType in MMO (#70452)"

2024-04-15 Thread via llvm-branch-commits

Author: AdityaK
Date: 2024-04-15T15:08:37-07:00
New Revision: 5415528397880c89b5408eed6131aa2c752797a1

URL: 
https://github.com/llvm/llvm-project/commit/5415528397880c89b5408eed6131aa2c752797a1
DIFF: 
https://github.com/llvm/llvm-project/commit/5415528397880c89b5408eed6131aa2c752797a1.diff

LOG: Revert "[CodeGen] Update for scalable MemoryType in MMO (#70452)"

This reverts commit 57146daeaaf366050dc913db910fcc2995a3e06d.

Added: 


Modified: 
llvm/include/llvm/Analysis/MemoryLocation.h
llvm/include/llvm/CodeGen/MachineFunction.h
llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
llvm/lib/CodeGen/MachineInstr.cpp
llvm/lib/CodeGen/MachineOperand.cpp
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
llvm/lib/CodeGen/SelectionDAG/SelectionDAGAddressAnalysis.cpp
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
llvm/test/CodeGen/AArch64/aarch64-sme2-asm.ll
llvm/test/CodeGen/AArch64/alloca-load-store-scalable-array.ll
llvm/test/CodeGen/AArch64/alloca-load-store-scalable-struct.ll
llvm/test/CodeGen/RISCV/rvv/alloca-load-store-scalable-array.ll
llvm/test/CodeGen/RISCV/rvv/alloca-load-store-scalable-struct.ll
llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops-mir.ll

Removed: 




diff --git a/llvm/include/llvm/Analysis/MemoryLocation.h b/llvm/include/llvm/Analysis/MemoryLocation.h
index 7d896c44f46795..830eed5d60ee46 100644
--- a/llvm/include/llvm/Analysis/MemoryLocation.h
+++ b/llvm/include/llvm/Analysis/MemoryLocation.h
@@ -297,6 +297,13 @@ class MemoryLocation {
 return MemoryLocation(Ptr, LocationSize::beforeOrAfterPointer(), AATags);
   }
 
+  // Return the exact size if the exact size is known at compiletime,
+  // otherwise return LocationSize::beforeOrAfterPointer().
+  static LocationSize getSizeOrUnknown(const TypeSize &T) {
+return T.isScalable() ? LocationSize::beforeOrAfterPointer()
+  : LocationSize::precise(T.getFixedValue());
+  }
+
   MemoryLocation() : Ptr(nullptr), Size(LocationSize::beforeOrAfterPointer()) {}
 
   explicit MemoryLocation(const Value *Ptr, LocationSize Size,

diff --git a/llvm/include/llvm/CodeGen/MachineFunction.h b/llvm/include/llvm/CodeGen/MachineFunction.h
index 470997b31fe85f..a0bc3aa1ed3140 100644
--- a/llvm/include/llvm/CodeGen/MachineFunction.h
+++ b/llvm/include/llvm/CodeGen/MachineFunction.h
@@ -1060,9 +1060,8 @@ class LLVM_EXTERNAL_VISIBILITY MachineFunction {
   int64_t Offset, LocationSize Size) {
 return getMachineMemOperand(
 MMO, Offset,
-!Size.hasValue() ? LLT()
-: Size.isScalable()
-? LLT::scalable_vector(1, 8 * Size.getValue().getKnownMinValue())
+!Size.hasValue() || Size.isScalable()
+? LLT()
 : LLT::scalar(8 * Size.getValue().getKnownMinValue()));
   }
   MachineMemOperand *getMachineMemOperand(const MachineMemOperand *MMO,

diff --git a/llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp b/llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
index fb9656c09ca39d..9fc8ecd60b03ff 100644
--- a/llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp
@@ -128,14 +128,14 @@ bool GISelAddressing::aliasIsKnownForLoadStore(const MachineInstr &MI1,
 // vector objects on the stack.
 // BasePtr1 is PtrDiff away from BasePtr0. They alias if none of the
 // following situations arise:
-if (PtrDiff >= 0 && Size1.hasValue() && !Size1.isScalable()) {
+if (PtrDiff >= 0 && Size1.hasValue()) {
   // [BasePtr0]
   // [---BasePtr1--]
   // PtrDiff>
   IsAlias = !((int64_t)Size1.getValue() <= PtrDiff);
   return true;
 }
-if (PtrDiff < 0 && Size2.hasValue() && !Size2.isScalable()) {
+if (PtrDiff < 0 && Size2.hasValue()) {
   // [BasePtr0]
   // [---BasePtr1--]
   // =(-PtrDiff)>
@@ -248,20 +248,10 @@ bool GISelAddressing::instMayAlias(const MachineInstr &MI,
   return false;
   }
 
-  // If NumBytes is scalable and offset is not 0, conservatively return may
-  // alias
-  if ((MUC0.NumBytes.isScalable() && MUC0.Offset != 0) ||
-  (MUC1.NumBytes.isScalable() && MUC1.Offset != 0))
-return true;
-
-  const bool BothNotScalable =
-  !MUC0.NumBytes.isScalable() && !MUC1.NumBytes.isScalable();
-
   // Try to prove that there is aliasing, or that there is no aliasing. Either
   // way, we can return now. If nothing can be proved, proceed with more tests.
   bool IsAlias;
-  if (BothNotScalable &&
-  GISelAddressing::aliasIsKnownForLoadStore(MI, Other, IsAlias, MRI))
+  if (GISelAddressing::aliasIsKnownForLoadStore(MI, Other, IsAlias, MRI))
 return IsAlias;
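
The comments in the aliasIsKnownForLoadStore hunk describe an interval check: the two accesses share a base, the second starts PtrDiff bytes after the first, and they cannot overlap when the earlier access ends at or before the start of the later one. A minimal sketch of that reasoning, under stated assumptions: aliasIsKnown is a hypothetical helper rather than the LLVM function, sizes are plain byte counts with std::nullopt standing in for "unknown", scalable sizes are not modelled, and the negative-offset branch is written by symmetry because the hunk above is cut off before it.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper, not the LLVM API. The first access starts at offset 0
// from the common base, the second at offset ptrDiff. A size of std::nullopt
// means the size is unknown. Returns true/false when overlap can be decided
// and std::nullopt when nothing can be proved.
std::optional<bool> aliasIsKnown(std::int64_t ptrDiff,
                                 std::optional<std::int64_t> sizeFirst,
                                 std::optional<std::int64_t> sizeSecond) {
  if (ptrDiff >= 0 && sizeFirst) {
    // The second access starts at or after the first: no overlap exactly
    // when the first access ends at or before ptrDiff.
    return !(*sizeFirst <= ptrDiff);
  }
  if (ptrDiff < 0 && sizeSecond) {
    // The second access starts before the first: no overlap exactly when
    // the second access ends at or before offset 0.
    return !(ptrDiff + *sizeSecond <= 0);
  }
  return std::nullopt;
}
```

For example, an 8-byte access followed by one starting 8 bytes later is provably disjoint, while the same pair 4 bytes apart overlaps. The comparison only makes sense for fixed byte counts; a scalable size is only a known minimum multiplied by vscale.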
 
 

[llvm-branch-commits] [llvm] 5557274 - [RISCV] Test for bug-88799

2024-04-15 Thread via llvm-branch-commits

Author: AdityaK
Date: 2024-04-15T15:18:05-07:00
New Revision: 555727461739dd9a9cf975e70767f5e3ca95f340

URL: 
https://github.com/llvm/llvm-project/commit/555727461739dd9a9cf975e70767f5e3ca95f340
DIFF: 
https://github.com/llvm/llvm-project/commit/555727461739dd9a9cf975e70767f5e3ca95f340.diff

LOG: [RISCV] Test for bug-88799

Added: 
llvm/test/CodeGen/RISCV/bug-88799-scalable-memory-type.ll

Modified: 


Removed: 




diff --git a/llvm/test/CodeGen/RISCV/bug-88799-scalable-memory-type.ll b/llvm/test/CodeGen/RISCV/bug-88799-scalable-memory-type.ll
new file mode 100644
index 00000000000000..e732db414bd472
--- /dev/null
+++ b/llvm/test/CodeGen/RISCV/bug-88799-scalable-memory-type.ll
@@ -0,0 +1,28 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4
+; RUN: llc < %s | FileCheck %s -check-prefix=RV64I
+
+target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"
+target triple = "riscv64-unknown-linux-gnu"
+
+; Function Attrs: vscale_range(2,2)
+define i32 @main() #0 {
+; RV64I-LABEL: main:
+; RV64I:   # %bb.0: # %vector.body
+; RV64I-NEXT:lui a0, 1040368
+; RV64I-NEXT:addiw a0, a0, -144
+; RV64I-NEXT:vl2re16.v v8, (a0)
+; RV64I-NEXT:vl2re16.v v10, (a0)
+; RV64I-NEXT:vs2r.v v8, (zero)
+; RV64I-NEXT:vs2r.v v10, (zero)
+; RV64I-NEXT:li a0, 0
+; RV64I-NEXT:ret
+vector.body:
+  %0 = load <16 x i16>, ptr getelementptr ([3 x [23 x [23 x i16]]], ptr null, i64 -10593, i64 1, i64 22, i64 0), align 16
+  store <16 x i16> %0, ptr null, align 2
+  %wide.load = load , ptr getelementptr ([3 x [23 x [23 x 
i16]]], ptr null, i64 -10593, i64 1, i64 22, i64 0), align 16
+  store  %wide.load, ptr null, align 2
+  ret i32 0
+}
+
+attributes #0 = { vscale_range(2,2) 
"target-features"="+64bit,+a,+c,+d,+f,+m,+relax,+v,+zicsr,+zifencei,+zve32f,+zve32x,+zve64d,+zve64f,+zve64x,+zvl128b,+zvl32b,+zvl64b,-e,-experimental-smmpm,-experimental-smnpm,-experimental-ssnpm,-experimental-sspm,-experimental-ssqosid,-experimental-supm,-experimental-zaamo,-experimental-zabha,-experimental-zalasr,-experimental-zalrsc,-experimental-zfbfmin,-experimental-zicfilp,-experimental-zicfiss,-experimental-ztso,-experimental-zvfbfmin,-experimental-zvfbfwma,-h,-shcounterenw,-shgatpa,-shtvala,-shvsatpa,-shvstvala,-shvstvecd,-smaia,-smepmp,-ssaia,-ssccptr,-sscofpmf,-sscounterenw,-ssstateen,-ssstrict,-sstc,-sstvala,-sstvecd,-ssu64xl,-svade,-svadu,-svbare,-svinval,-svnapot,-svpbmt,-xcvalu,-xcvbi,-xcvbitmanip,-xcvelw,-xcvmac,-xcvmem,-xcvsimd,-xsfcease,-xsfvcp,-xsfvfnrclipxfqf,-xsfvfwmaccqqq,-xsfvqmaccdod,-xsfvqmaccqoq,-xsifivecdiscarddlone,-xsifivecflushdlone,-xtheadba,-xtheadbb,-xtheadbs,-xtheadcmo,-xtheadcondmov,-xtheadfmemidx,-xtheadmac,-xtheadmemidx,-xtheadmempair,-xtheadsync,-xtheadvdot,-xventanacondops,-za128rs,-za64rs,-zacas,-zawrs,-zba,-zbb,-zbc,-zbkb,-zbkc,-zbkx,-zbs,-zca,-zcb,-zcd,-zce,-zcf,-zcmop,-zcmp,-zcmt,-zdinx,-zfa,-zfh,-zfhmin,-zfinx,-zhinx,-zhinxmin,-zic64b,-zicbom,-zicbop,-zicboz,-ziccamoa,-ziccif,-zicclsm,-ziccrse,-zicntr,-zicond,-zihintntl,-zihintpause,-zihpm,-zimop,-zk,-zkn,-zknd,-zkne,-zknh,-zkr,-zks,-zksed,-zksh,-zkt,-zmmul,-zvbb,-zvbc,-zvfh,-zvfhmin,-zvkb,-zvkg,-zvkn,-zvknc,-zvkned,-zvkng,-zvknha,-zvknhb,-zvks,-zvksc,-zvksed,-zvksg,-zvksh,-zvkt,-zvl1024b,-zvl16384b,-zvl2048b,-zvl256b,-zvl32768b,-zvl4096b,-zvl512b,-zvl65536b,-zvl8192b"
 }
+



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [libcxx] release/18.x: [libcxx] coerce formatter precision to int (#87738) (PR #87801)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/87801

>From d89da2ac8839204dec5db6466d5b71efed6bfd4d Mon Sep 17 00:00:00 2001
From: Brian Cain 
Date: Fri, 5 Apr 2024 11:06:37 -0500
Subject: [PATCH] [libcxx] coerce formatter precision to int (#87738)

__precision_ is declared as an int32_t which on some hexagon platforms
is defined as a long.

This change fixes errors like the ones below:

In file included from
/local/mnt/workspace/hex/llvm-project/libcxx/test/libcxx/diagnostics/format.nodiscard_extensions.compile.pass.cpp:19:
In file included from
/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/format:202:
In file included from
/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/format_functions.h:29:

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/formatter_floating_point.h:700:17:
error: no matching function for call to 'max'
700 | int __p = std::max(1, (__specs.__has_precision() ?
__specs.__precision_ : 6));
  | ^~~~

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/formatter_floating_point.h:771:25:
note: in instantiation of function template specialization
'std::__formatter::__format_floating_point' requested here
771 | return __formatter::__format_floating_point(__value, __ctx,
__parser_.__get_parsed_std_specifications(__ctx));
  | ^

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/format_functions.h:284:42:
note: in instantiation of function template specialization
'std::__formatter_floating_point::format' requested here
284 | __ctx.advance_to(__formatter.format(__arg, __ctx));
  |  ^

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/format_functions.h:429:15:
note: in instantiation of function template specialization
'std::__vformat_to, char,
std::back_insert_iterator>>'
requested here
429 | return std::__vformat_to(std::move(__out_it), __fmt, __args);
  |   ^

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/format_functions.h:462:8:
note: in instantiation of function template specialization
'std::vformat_to>' requested here
  462 |   std::vformat_to(std::back_inserter(__res), __fmt, __args);
  |^

/local/mnt/workspace/hex/llvm-project/libcxx/test/libcxx/diagnostics/format.nodiscard_extensions.compile.pass.cpp:29:8:
note: in instantiation of function template specialization
'std::vformat' requested here
   29 |   std::vformat("", std::make_format_args());
  |^

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__algorithm/max.h:35:1:
note: candidate template ignored: deduced conflicting types for
parameter '_Tp' ('int' vs. 'int32_t' (aka 'long'))
35 | max(_LIBCPP_LIFETIMEBOUND const _Tp& __a, _LIBCPP_LIFETIMEBOUND
const _Tp& __b) {
  | ^

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__algorithm/max.h:43:1:
note: candidate template ignored: could not match
'initializer_list<_Tp>' against 'int'
   43 | max(initializer_list<_Tp> __t, _Compare __comp) {
  | ^

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__algorithm/max.h:48:86:
note: candidate function template not viable: requires single argument
'__t', but 2 arguments were provided
48 | _LIBCPP_NODISCARD_EXT inline _LIBCPP_HIDE_FROM_ABI
_LIBCPP_CONSTEXPR_SINCE_CXX14 _Tp max(initializer_list<_Tp> __t) {
| ^ ~

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__algorithm/max.h:29:1:
note: candidate function template not viable: requires 3 arguments, but
2 were provided
29 | max(_LIBCPP_LIFETIMEBOUND const _Tp& __a, _LIBCPP_LIFETIMEBOUND
const _Tp& __b, _Compare __comp) {
| ^
~

(cherry picked from commit e1830f586ac4c504f632bdb69aab49234256e899)
---
 libcxx/include/__format/formatter_floating_point.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libcxx/include/__format/formatter_floating_point.h b/libcxx/include/__format/formatter_floating_point.h
index 6802a8b7bd4ca3..46a090a787ae28 100644
--- a/libcxx/include/__format/formatter_floating_point.h
+++ b/libcxx/include/__format/formatter_floating_point.h
@@ -689,7 +689,7 @@ __format_floating_point(_Tp __value, _FormatContext& __ctx, __format_spec::__par
   // Let P equal the precision if nonzero, 6 if the precision is not
   // specified, or 1 if the precision is 0. Then, if a conversion with
   // style E would have an exponent of X:
-  int __p = std::max(1, (__specs.__has_precision() ? __specs.__precision_ : 6));
+  int __p = std::max(1, (__sp
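
The error in the log above comes from template argument deduction: std::max takes both arguments as const _Tp&, so an int literal and a value whose type is long give conflicting deductions for _Tp. A minimal sketch of one common way to resolve this, assuming a stand-in typedef for a platform where the 32-bit integer type is long (precision_t and effectivePrecision are hypothetical names; the replacement line in the diff above is cut off, so this is an illustration and not a quote of the patch):

```cpp
#include <algorithm>
#include <cassert>

// Stand-in for a platform where the 32-bit integer typedef is `long`.
// Hypothetical name; the field in the log above is libc++'s __precision_.
using precision_t = long;

int effectivePrecision(bool hasPrecision, precision_t precision) {
  // std::max(1, hasPrecision ? precision : 6) would not compile here,
  // because _Tp is deduced as `int` from the first argument and as `long`
  // from the second. Naming the template argument skips deduction and
  // converts both operands to int.
  return std::max<int>(1, hasPrecision ? precision : 6);
}
```

On platforms where the typedef is plain int the two deductions agree, which is why the original line built without complaint there.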

[llvm-branch-commits] [libcxx] d89da2a - [libcxx] coerce formatter precision to int (#87738)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

Author: Brian Cain
Date: 2024-04-15T15:31:46-07:00
New Revision: d89da2ac8839204dec5db6466d5b71efed6bfd4d

URL: 
https://github.com/llvm/llvm-project/commit/d89da2ac8839204dec5db6466d5b71efed6bfd4d
DIFF: 
https://github.com/llvm/llvm-project/commit/d89da2ac8839204dec5db6466d5b71efed6bfd4d.diff

LOG: [libcxx] coerce formatter precision to int (#87738)

__precision_ is declared as an int32_t which on some hexagon platforms
is defined as a long.

This change fixes errors like the ones below:

In file included from
/local/mnt/workspace/hex/llvm-project/libcxx/test/libcxx/diagnostics/format.nodiscard_extensions.compile.pass.cpp:19:
In file included from
/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/format:202:
In file included from
/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/format_functions.h:29:

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/formatter_floating_point.h:700:17:
error: no matching function for call to 'max'
700 | int __p = std::max(1, (__specs.__has_precision() ?
__specs.__precision_ : 6));
  | ^~~~

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/formatter_floating_point.h:771:25:
note: in instantiation of function template specialization
'std::__formatter::__format_floating_point' requested here
771 | return __formatter::__format_floating_point(__value, __ctx,
__parser_.__get_parsed_std_specifications(__ctx));
  | ^

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/format_functions.h:284:42:
note: in instantiation of function template specialization
'std::__formatter_floating_point::format' requested here
284 | __ctx.advance_to(__formatter.format(__arg, __ctx));
  |  ^

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/format_functions.h:429:15:
note: in instantiation of function template specialization
'std::__vformat_to, char,
std::back_insert_iterator>>'
requested here
429 | return std::__vformat_to(std::move(__out_it), __fmt, __args);
  |   ^

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__format/format_functions.h:462:8:
note: in instantiation of function template specialization
'std::vformat_to>' requested here
  462 |   std::vformat_to(std::back_inserter(__res), __fmt, __args);
  |^

/local/mnt/workspace/hex/llvm-project/libcxx/test/libcxx/diagnostics/format.nodiscard_extensions.compile.pass.cpp:29:8:
note: in instantiation of function template specialization
'std::vformat' requested here
   29 |   std::vformat("", std::make_format_args());
  |^

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__algorithm/max.h:35:1:
note: candidate template ignored: deduced conflicting types for
parameter '_Tp' ('int' vs. 'int32_t' (aka 'long'))
35 | max(_LIBCPP_LIFETIMEBOUND const _Tp& __a, _LIBCPP_LIFETIMEBOUND
const _Tp& __b) {
  | ^

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__algorithm/max.h:43:1:
note: candidate template ignored: could not match
'initializer_list<_Tp>' against 'int'
   43 | max(initializer_list<_Tp> __t, _Compare __comp) {
  | ^

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__algorithm/max.h:48:86:
note: candidate function template not viable: requires single argument
'__t', but 2 arguments were provided
48 | _LIBCPP_NODISCARD_EXT inline _LIBCPP_HIDE_FROM_ABI
_LIBCPP_CONSTEXPR_SINCE_CXX14 _Tp max(initializer_list<_Tp> __t) {
| ^ ~

/local/mnt/workspace/hex/obj_runtimes_hex88_qurt_v75_ON_ON_shared/include/c++/v1/__algorithm/max.h:29:1:
note: candidate function template not viable: requires 3 arguments, but
2 were provided
29 | max(_LIBCPP_LIFETIMEBOUND const _Tp& __a, _LIBCPP_LIFETIMEBOUND
const _Tp& __b, _Compare __comp) {
| ^
~

(cherry picked from commit e1830f586ac4c504f632bdb69aab49234256e899)
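
The "deduced conflicting types" note above can be reproduced in isolation.
A minimal sketch, assuming a platform where int32_t is a typedef for long
(as the diagnostic reports); the alias int32_like_t and the helper
clamp_precision are hypothetical names used only for illustration, not
libc++ code:

```cpp
#include <algorithm>
#include <cassert>

// Stand-in for a target where int32_t is `long` (assumption for this
// sketch; on most hosts int32_t is `int`).
using int32_like_t = long;

// std::max(precision, minimum) would not compile here: the single template
// parameter _Tp would have to be both `long` and `int`.
// Coercing one argument so both share a type resolves the conflict.
inline int clamp_precision(int32_like_t precision, int minimum) {
  return std::max(static_cast<int>(precision), minimum);
}
```

Calling clamp_precision(0, 1) picks the minimum, which mirrors the
"1 if the precision is 0" rule quoted in the diff context below.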

Added: 


Modified: 
libcxx/include/__format/formatter_floating_point.h

Removed: 




diff --git a/libcxx/include/__format/formatter_floating_point.h 
b/libcxx/include/__format/formatter_floating_point.h
index 6802a8b7bd4ca3..46a090a787ae28 100644
--- a/libcxx/include/__format/formatter_floating_point.h
+++ b/libcxx/include/__format/formatter_floating_point.h
@@ -689,7 +689,7 @@ __format_floating_point(_Tp __value, _FormatContext& __ctx, 
__format_spec::__par
   // Let P equal the precision if nonzero, 6 if the precision is not
   // specified, or 1 if the precision is 0. Then, if a conversion with
   // style E would hav

[llvm-branch-commits] [libcxx] release/18.x: [libcxx] coerce formatter precision to int (#87738) (PR #87801)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/87801


[llvm-branch-commits] [llvm] release/18.x: [SLP]Fix a crash if the argument of call was affected by minbitwidth analysis (PR #86731)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/86731

>From 6e071cf30599e821be56b75e6041cfedb7872216 Mon Sep 17 00:00:00 2001
From: Alexey Bataev 
Date: Thu, 21 Mar 2024 17:05:50 -0700
Subject: [PATCH] [SLP]Fix a crash if the argument of call was affected by
 minbitwidth analysis.

Need to support proper type conversion for function arguments to avoid
compiler crash.
---
 .../Transforms/Vectorize/SLPVectorizer.cpp| 21 -
 .../X86/call-arg-reduced-by-minbitwidth.ll| 82 +++
 2 files changed, 102 insertions(+), 1 deletion(-)
 create mode 100644 
llvm/test/Transforms/SLPVectorizer/X86/call-arg-reduced-by-minbitwidth.ll

diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp 
b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 0a9e2c7f49f55f..1fbd69e38eaeec 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -11653,12 +11653,12 @@ Value *BoUpSLP::vectorizeTree(TreeEntry *E, bool 
PostponedPHIs) {
   if (UseIntrinsic && isVectorIntrinsicWithOverloadTypeAtArg(ID, -1))
 TysForDecl.push_back(
 FixedVectorType::get(CI->getType(), E->Scalars.size()));
+  auto *CEI = cast<CallInst>(VL0);
   for (unsigned I : seq<unsigned>(0, CI->arg_size())) {
 ValueList OpVL;
 // Some intrinsics have scalar arguments. This argument should not be
 // vectorized.
 if (UseIntrinsic && isVectorIntrinsicWithScalarOpAtArg(ID, I)) {
-  CallInst *CEI = cast<CallInst>(VL0);
   ScalarArg = CEI->getArgOperand(I);
   OpVecs.push_back(CEI->getArgOperand(I));
   if (isVectorIntrinsicWithOverloadTypeAtArg(ID, I))
@@ -11671,6 +11671,25 @@ Value *BoUpSLP::vectorizeTree(TreeEntry *E, bool 
PostponedPHIs) {
   LLVM_DEBUG(dbgs() << "SLP: Diamond merged for " << *VL0 << ".\n");
   return E->VectorizedValue;
 }
+auto GetOperandSignedness = [&](unsigned Idx) {
+  const TreeEntry *OpE = getOperandEntry(E, Idx);
+  bool IsSigned = false;
+  auto It = MinBWs.find(OpE);
+  if (It != MinBWs.end())
+IsSigned = It->second.second;
+  else
+IsSigned = any_of(OpE->Scalars, [&](Value *R) {
+  return !isKnownNonNegative(R, SimplifyQuery(*DL));
+});
+  return IsSigned;
+};
+ScalarArg = CEI->getArgOperand(I);
+if (cast<VectorType>(OpVec->getType())->getElementType() !=
+ScalarArg->getType()) {
+  auto *CastTy = FixedVectorType::get(ScalarArg->getType(),
+  VecTy->getNumElements());
+  OpVec = Builder.CreateIntCast(OpVec, CastTy, GetOperandSignedness(I));
+}
 LLVM_DEBUG(dbgs() << "SLP: OpVec[" << I << "]: " << *OpVec << "\n");
 OpVecs.push_back(OpVec);
 if (UseIntrinsic && isVectorIntrinsicWithOverloadTypeAtArg(ID, I))
diff --git 
a/llvm/test/Transforms/SLPVectorizer/X86/call-arg-reduced-by-minbitwidth.ll 
b/llvm/test/Transforms/SLPVectorizer/X86/call-arg-reduced-by-minbitwidth.ll
new file mode 100644
index 00..49e89feb475b95
--- /dev/null
+++ b/llvm/test/Transforms/SLPVectorizer/X86/call-arg-reduced-by-minbitwidth.ll
@@ -0,0 +1,82 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 4
+; RUN: opt -S --passes=slp-vectorizer -mtriple=x86_64-pc-windows-msvc19.34.0 < 
%s | FileCheck %s
+
+define void @test(ptr %0, i8 %1, i1 %cmp12.i) {
+; CHECK-LABEL: define void @test(
+; CHECK-SAME: ptr [[TMP0:%.*]], i8 [[TMP1:%.*]], i1 [[CMP12_I:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[TMP2:%.*]] = insertelement <8 x i1> poison, i1 [[CMP12_I]], 
i32 0
+; CHECK-NEXT:[[TMP3:%.*]] = shufflevector <8 x i1> [[TMP2]], <8 x i1> 
poison, <8 x i32> zeroinitializer
+; CHECK-NEXT:[[TMP4:%.*]] = insertelement <8 x i8> poison, i8 [[TMP1]], 
i32 0
+; CHECK-NEXT:[[TMP5:%.*]] = shufflevector <8 x i8> [[TMP4]], <8 x i8> 
poison, <8 x i32> zeroinitializer
+; CHECK-NEXT:br label [[PRE:%.*]]
+; CHECK:   pre:
+; CHECK-NEXT:[[TMP6:%.*]] = zext <8 x i8> [[TMP5]] to <8 x i32>
+; CHECK-NEXT:[[TMP7:%.*]] = call <8 x i32> @llvm.umax.v8i32(<8 x i32> 
[[TMP6]], <8 x i32> )
+; CHECK-NEXT:[[TMP8:%.*]] = add <8 x i32> [[TMP7]], 
+; CHECK-NEXT:[[TMP9:%.*]] = select <8 x i1> [[TMP3]], <8 x i32> [[TMP8]], 
<8 x i32> [[TMP6]]
+; CHECK-NEXT:[[TMP10:%.*]] = trunc <8 x i32> [[TMP9]] to <8 x i8>
+; CHECK-NEXT:store <8 x i8> [[TMP10]], ptr [[TMP0]], align 1
+; CHECK-NEXT:br label [[PRE]]
+;
+entry:
+  %idx11 = getelementptr i8, ptr %0, i64 1
+  %idx22 = getelementptr i8, ptr %0, i64 2
+  %idx33 = getelementptr i8, ptr %0, i64 3
+  %idx44 = getelementptr i8, ptr %0, i64 4
+  %idx55 = getelementptr i8, ptr %0, i64 5
+  %idx66 = getelementptr i8, ptr %0, i64 6
+  %idx77 = getelementptr i8, ptr %0, i64 7
+  br label %pre
+
+pre:
+  %conv.i = zext i8 %1 to i32
+  %2 = tail call i

[llvm-branch-commits] [llvm] 6e071cf - [SLP]Fix a crash if the argument of call was affected by minbitwidth analysis.

2024-04-15 Thread Tom Stellard via llvm-branch-commits

Author: Alexey Bataev
Date: 2024-04-15T15:33:32-07:00
New Revision: 6e071cf30599e821be56b75e6041cfedb7872216

URL: 
https://github.com/llvm/llvm-project/commit/6e071cf30599e821be56b75e6041cfedb7872216
DIFF: 
https://github.com/llvm/llvm-project/commit/6e071cf30599e821be56b75e6041cfedb7872216.diff

LOG: [SLP]Fix a crash if the argument of call was affected by minbitwidth 
analysis.

Need to support proper type conversion for function arguments to avoid
compiler crash.
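
The signedness flag matters because widening a narrowed value back to the
call's argument type gives different results under sign and zero
extension. A minimal scalar sketch of that distinction; widenLane is a
hypothetical helper for illustration, not part of SLPVectorizer:

```cpp
#include <cassert>
#include <cstdint>

// Widen one 8-bit lane back to 32 bits. `isSigned` plays the role of the
// flag given to an integer cast: true sign-extends, false zero-extends.
inline int32_t widenLane(uint8_t narrow, bool isSigned) {
  if (isSigned)
    return static_cast<int32_t>(static_cast<int8_t>(narrow)); // sext
  return static_cast<int32_t>(narrow);                         // zext
}
```

For the bit pattern 0xFF the two choices yield -1 and 255 respectively,
so the cast inserted before the call has to use the signedness recorded
for the operand.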

Added: 
llvm/test/Transforms/SLPVectorizer/X86/call-arg-reduced-by-minbitwidth.ll

Modified: 
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

Removed: 




diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp 
b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index 0a9e2c7f49f55f..1fbd69e38eaeec 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -11653,12 +11653,12 @@ Value *BoUpSLP::vectorizeTree(TreeEntry *E, bool 
PostponedPHIs) {
   if (UseIntrinsic && isVectorIntrinsicWithOverloadTypeAtArg(ID, -1))
 TysForDecl.push_back(
 FixedVectorType::get(CI->getType(), E->Scalars.size()));
+  auto *CEI = cast<CallInst>(VL0);
   for (unsigned I : seq<unsigned>(0, CI->arg_size())) {
 ValueList OpVL;
 // Some intrinsics have scalar arguments. This argument should not be
 // vectorized.
 if (UseIntrinsic && isVectorIntrinsicWithScalarOpAtArg(ID, I)) {
-  CallInst *CEI = cast<CallInst>(VL0);
   ScalarArg = CEI->getArgOperand(I);
   OpVecs.push_back(CEI->getArgOperand(I));
   if (isVectorIntrinsicWithOverloadTypeAtArg(ID, I))
@@ -11671,6 +11671,25 @@ Value *BoUpSLP::vectorizeTree(TreeEntry *E, bool 
PostponedPHIs) {
   LLVM_DEBUG(dbgs() << "SLP: Diamond merged for " << *VL0 << ".\n");
   return E->VectorizedValue;
 }
+auto GetOperandSignedness = [&](unsigned Idx) {
+  const TreeEntry *OpE = getOperandEntry(E, Idx);
+  bool IsSigned = false;
+  auto It = MinBWs.find(OpE);
+  if (It != MinBWs.end())
+IsSigned = It->second.second;
+  else
+IsSigned = any_of(OpE->Scalars, [&](Value *R) {
+  return !isKnownNonNegative(R, SimplifyQuery(*DL));
+});
+  return IsSigned;
+};
+ScalarArg = CEI->getArgOperand(I);
+if (cast<VectorType>(OpVec->getType())->getElementType() !=
+ScalarArg->getType()) {
+  auto *CastTy = FixedVectorType::get(ScalarArg->getType(),
+  VecTy->getNumElements());
+  OpVec = Builder.CreateIntCast(OpVec, CastTy, GetOperandSignedness(I));
+}
 LLVM_DEBUG(dbgs() << "SLP: OpVec[" << I << "]: " << *OpVec << "\n");
 OpVecs.push_back(OpVec);
 if (UseIntrinsic && isVectorIntrinsicWithOverloadTypeAtArg(ID, I))

diff --git 
a/llvm/test/Transforms/SLPVectorizer/X86/call-arg-reduced-by-minbitwidth.ll 
b/llvm/test/Transforms/SLPVectorizer/X86/call-arg-reduced-by-minbitwidth.ll
new file mode 100644
index 00..49e89feb475b95
--- /dev/null
+++ b/llvm/test/Transforms/SLPVectorizer/X86/call-arg-reduced-by-minbitwidth.ll
@@ -0,0 +1,82 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 4
+; RUN: opt -S --passes=slp-vectorizer -mtriple=x86_64-pc-windows-msvc19.34.0 < 
%s | FileCheck %s
+
+define void @test(ptr %0, i8 %1, i1 %cmp12.i) {
+; CHECK-LABEL: define void @test(
+; CHECK-SAME: ptr [[TMP0:%.*]], i8 [[TMP1:%.*]], i1 [[CMP12_I:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[TMP2:%.*]] = insertelement <8 x i1> poison, i1 [[CMP12_I]], 
i32 0
+; CHECK-NEXT:[[TMP3:%.*]] = shufflevector <8 x i1> [[TMP2]], <8 x i1> 
poison, <8 x i32> zeroinitializer
+; CHECK-NEXT:[[TMP4:%.*]] = insertelement <8 x i8> poison, i8 [[TMP1]], 
i32 0
+; CHECK-NEXT:[[TMP5:%.*]] = shufflevector <8 x i8> [[TMP4]], <8 x i8> 
poison, <8 x i32> zeroinitializer
+; CHECK-NEXT:br label [[PRE:%.*]]
+; CHECK:   pre:
+; CHECK-NEXT:[[TMP6:%.*]] = zext <8 x i8> [[TMP5]] to <8 x i32>
+; CHECK-NEXT:[[TMP7:%.*]] = call <8 x i32> @llvm.umax.v8i32(<8 x i32> 
[[TMP6]], <8 x i32> )
+; CHECK-NEXT:[[TMP8:%.*]] = add <8 x i32> [[TMP7]], 
+; CHECK-NEXT:[[TMP9:%.*]] = select <8 x i1> [[TMP3]], <8 x i32> [[TMP8]], 
<8 x i32> [[TMP6]]
+; CHECK-NEXT:[[TMP10:%.*]] = trunc <8 x i32> [[TMP9]] to <8 x i8>
+; CHECK-NEXT:store <8 x i8> [[TMP10]], ptr [[TMP0]], align 1
+; CHECK-NEXT:br label [[PRE]]
+;
+entry:
+  %idx11 = getelementptr i8, ptr %0, i64 1
+  %idx22 = getelementptr i8, ptr %0, i64 2
+  %idx33 = getelementptr i8, ptr %0, i64 3
+  %idx44 = getelementptr i8, ptr %0, i64 4
+  %idx55 = getelementptr i8, ptr %0, i64 5
+  %idx66 = getelementptr i8, ptr %0, i64 6
+  %idx77 = getelementptr i8, ptr %0, i64 7
+  br label %pre
+
+pre:
+  %conv

[llvm-branch-commits] [llvm] release/18.x: [SLP]Fix a crash if the argument of call was affected by minbitwidth analysis (PR #86731)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/86731


[llvm-branch-commits] [llvm] release/18.x: Prepend all library intrinsics with `#` when building for Arm64EC (PR #88016)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/88016

>From 4056cc29dfd3cd40f481b499936f15bc85abd75f Mon Sep 17 00:00:00 2001
From: Daniel Paoliello 
Date: Fri, 5 Apr 2024 12:06:47 -0700
Subject: [PATCH] Prepend all library intrinsics with `#` when building for
 Arm64EC (#87542)

While attempting to build some Rust code, I was getting linker errors
due to missing functions that are implemented in `compiler-rt`. Turns
out that when `compiler-rt` is built for Arm64EC, all its function names
are mangled with the leading `#`.

This change removes the hard-coded list of library-implemented
intrinsics to mangle for Arm64EC, and instead assumes that they all must
be mangled.
---
 .../Target/AArch64/AArch64ISelLowering.cpp| 42 ---
 llvm/lib/Target/AArch64/AArch64ISelLowering.h |  3 ++
 2 files changed, 11 insertions(+), 34 deletions(-)

diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 196aa50cf4060b..95d8ab95b2c097 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -1658,40 +1658,14 @@ AArch64TargetLowering::AArch64TargetLowering(const 
TargetMachine &TM,
   setMaxAtomicSizeInBitsSupported(128);
 
   if (Subtarget->isWindowsArm64EC()) {
-// FIXME: are there other intrinsics we need to add here?
-setLibcallName(RTLIB::MEMCPY, "#memcpy");
-setLibcallName(RTLIB::MEMSET, "#memset");
-setLibcallName(RTLIB::MEMMOVE, "#memmove");
-setLibcallName(RTLIB::REM_F32, "#fmodf");
-setLibcallName(RTLIB::REM_F64, "#fmod");
-setLibcallName(RTLIB::FMA_F32, "#fmaf");
-setLibcallName(RTLIB::FMA_F64, "#fma");
-setLibcallName(RTLIB::SQRT_F32, "#sqrtf");
-setLibcallName(RTLIB::SQRT_F64, "#sqrt");
-setLibcallName(RTLIB::CBRT_F32, "#cbrtf");
-setLibcallName(RTLIB::CBRT_F64, "#cbrt");
-setLibcallName(RTLIB::LOG_F32, "#logf");
-setLibcallName(RTLIB::LOG_F64, "#log");
-setLibcallName(RTLIB::LOG2_F32, "#log2f");
-setLibcallName(RTLIB::LOG2_F64, "#log2");
-setLibcallName(RTLIB::LOG10_F32, "#log10f");
-setLibcallName(RTLIB::LOG10_F64, "#log10");
-setLibcallName(RTLIB::EXP_F32, "#expf");
-setLibcallName(RTLIB::EXP_F64, "#exp");
-setLibcallName(RTLIB::EXP2_F32, "#exp2f");
-setLibcallName(RTLIB::EXP2_F64, "#exp2");
-setLibcallName(RTLIB::EXP10_F32, "#exp10f");
-setLibcallName(RTLIB::EXP10_F64, "#exp10");
-setLibcallName(RTLIB::SIN_F32, "#sinf");
-setLibcallName(RTLIB::SIN_F64, "#sin");
-setLibcallName(RTLIB::COS_F32, "#cosf");
-setLibcallName(RTLIB::COS_F64, "#cos");
-setLibcallName(RTLIB::POW_F32, "#powf");
-setLibcallName(RTLIB::POW_F64, "#pow");
-setLibcallName(RTLIB::LDEXP_F32, "#ldexpf");
-setLibcallName(RTLIB::LDEXP_F64, "#ldexp");
-setLibcallName(RTLIB::FREXP_F32, "#frexpf");
-setLibcallName(RTLIB::FREXP_F64, "#frexp");
+// FIXME: are there intrinsics we need to exclude from this?
+for (int i = 0; i < RTLIB::UNKNOWN_LIBCALL; ++i) {
+  auto code = static_cast<RTLIB::Libcall>(i);
+  auto libcallName = getLibcallName(code);
+  if ((libcallName != nullptr) && (libcallName[0] != '#')) {
+setLibcallName(code, Saver.save(Twine("#") + libcallName).data());
+  }
+}
   }
 }
 
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.h 
b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
index 541a810fb5cba0..74d0c4bde8dd2e 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -1001,6 +1001,9 @@ class AArch64TargetLowering : public TargetLowering {
   /// make the right decision when generating code for different targets.
   const AArch64Subtarget *Subtarget;
 
+  llvm::BumpPtrAllocator BumpAlloc;
+  llvm::StringSaver Saver{BumpAlloc};
+
   bool isExtFreeImpl(const Instruction *Ext) const override;
 
   void addTypeForNEON(MVT VT);



[llvm-branch-commits] [llvm] 4056cc2 - Prepend all library intrinsics with `#` when building for Arm64EC (#87542)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

Author: Daniel Paoliello
Date: 2024-04-15T15:57:42-07:00
New Revision: 4056cc29dfd3cd40f481b499936f15bc85abd75f

URL: 
https://github.com/llvm/llvm-project/commit/4056cc29dfd3cd40f481b499936f15bc85abd75f
DIFF: 
https://github.com/llvm/llvm-project/commit/4056cc29dfd3cd40f481b499936f15bc85abd75f.diff

LOG: Prepend all library intrinsics with `#` when building for Arm64EC (#87542)

While attempting to build some Rust code, I was getting linker errors
due to missing functions that are implemented in `compiler-rt`. Turns
out that when `compiler-rt` is built for Arm64EC, all its function names
are mangled with the leading `#`.

This change removes the hard-coded list of library-implemented
intrinsics to mangle for Arm64EC, and instead assumes that they all must
be mangled.
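
The "mangle everything" approach can be sketched without LLVM types. The
LibcallTable struct below is a hypothetical stand-in for the libcall name
table, and the std::deque is an assumed substitute for the string saver
used in the patch; it only illustrates the loop shape and the need to keep
the prefixed strings alive:

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Hypothetical stand-in for a libcall name table; entries may be null.
struct LibcallTable {
  std::vector<const char *> names;
  // Owns the prefixed strings. std::deque::push_back never relocates
  // existing elements, so stored c_str() pointers stay valid.
  std::deque<std::string> storage;

  // Prefix every non-null name that is not already prefixed with '#'.
  void prefixAll() {
    for (const char *&name : names) {
      if (name != nullptr && name[0] != '#') {
        storage.push_back(std::string("#") + name);
        name = storage.back().c_str();
      }
    }
  }
};
```

Because already-prefixed names are skipped, running prefixAll() twice
leaves the table unchanged.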

Added: 


Modified: 
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
llvm/lib/Target/AArch64/AArch64ISelLowering.h

Removed: 




diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp 
b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 196aa50cf4060b..95d8ab95b2c097 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -1658,40 +1658,14 @@ AArch64TargetLowering::AArch64TargetLowering(const 
TargetMachine &TM,
   setMaxAtomicSizeInBitsSupported(128);
 
   if (Subtarget->isWindowsArm64EC()) {
-// FIXME: are there other intrinsics we need to add here?
-setLibcallName(RTLIB::MEMCPY, "#memcpy");
-setLibcallName(RTLIB::MEMSET, "#memset");
-setLibcallName(RTLIB::MEMMOVE, "#memmove");
-setLibcallName(RTLIB::REM_F32, "#fmodf");
-setLibcallName(RTLIB::REM_F64, "#fmod");
-setLibcallName(RTLIB::FMA_F32, "#fmaf");
-setLibcallName(RTLIB::FMA_F64, "#fma");
-setLibcallName(RTLIB::SQRT_F32, "#sqrtf");
-setLibcallName(RTLIB::SQRT_F64, "#sqrt");
-setLibcallName(RTLIB::CBRT_F32, "#cbrtf");
-setLibcallName(RTLIB::CBRT_F64, "#cbrt");
-setLibcallName(RTLIB::LOG_F32, "#logf");
-setLibcallName(RTLIB::LOG_F64, "#log");
-setLibcallName(RTLIB::LOG2_F32, "#log2f");
-setLibcallName(RTLIB::LOG2_F64, "#log2");
-setLibcallName(RTLIB::LOG10_F32, "#log10f");
-setLibcallName(RTLIB::LOG10_F64, "#log10");
-setLibcallName(RTLIB::EXP_F32, "#expf");
-setLibcallName(RTLIB::EXP_F64, "#exp");
-setLibcallName(RTLIB::EXP2_F32, "#exp2f");
-setLibcallName(RTLIB::EXP2_F64, "#exp2");
-setLibcallName(RTLIB::EXP10_F32, "#exp10f");
-setLibcallName(RTLIB::EXP10_F64, "#exp10");
-setLibcallName(RTLIB::SIN_F32, "#sinf");
-setLibcallName(RTLIB::SIN_F64, "#sin");
-setLibcallName(RTLIB::COS_F32, "#cosf");
-setLibcallName(RTLIB::COS_F64, "#cos");
-setLibcallName(RTLIB::POW_F32, "#powf");
-setLibcallName(RTLIB::POW_F64, "#pow");
-setLibcallName(RTLIB::LDEXP_F32, "#ldexpf");
-setLibcallName(RTLIB::LDEXP_F64, "#ldexp");
-setLibcallName(RTLIB::FREXP_F32, "#frexpf");
-setLibcallName(RTLIB::FREXP_F64, "#frexp");
+// FIXME: are there intrinsics we need to exclude from this?
+for (int i = 0; i < RTLIB::UNKNOWN_LIBCALL; ++i) {
+  auto code = static_cast<RTLIB::Libcall>(i);
+  auto libcallName = getLibcallName(code);
+  if ((libcallName != nullptr) && (libcallName[0] != '#')) {
+setLibcallName(code, Saver.save(Twine("#") + libcallName).data());
+  }
+}
   }
 }
 

diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.h 
b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
index 541a810fb5cba0..74d0c4bde8dd2e 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.h
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.h
@@ -1001,6 +1001,9 @@ class AArch64TargetLowering : public TargetLowering {
   /// make the right decision when generating code for different targets.
   const AArch64Subtarget *Subtarget;
 
+  llvm::BumpPtrAllocator BumpAlloc;
+  llvm::StringSaver Saver{BumpAlloc};
+
   bool isExtFreeImpl(const Instruction *Ext) const override;
 
   void addTypeForNEON(MVT VT);





[llvm-branch-commits] [llvm] release/18.x: Prepend all library intrinsics with `#` when building for Arm64EC (PR #88016)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/88016


[llvm-branch-commits] [llvm] release/18.x: [InstSimplify] Make sure the simplified value doesn't generate poison in threadBinOpOverSelect (#87075) (PR #88353)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/88353

>From d0ddcce21d91eb8755ba45d700f6e0c6f6671a79 Mon Sep 17 00:00:00 2001
From: Yingwei Zheng 
Date: Thu, 11 Apr 2024 12:48:52 +0800
Subject: [PATCH] [InstSimplify] Make sure the simplified value doesn't
 generate poison in threadBinOpOverSelect (#87075)

Alive2: https://alive2.llvm.org/ce/z/y_Jmdn
Fix https://github.com/llvm/llvm-project/issues/87042.

(cherry picked from commit 3197f9d8b0efc3efdc531421bd11c16305d9b1ff)
---
 llvm/lib/Analysis/InstructionSimplify.cpp|  3 +-
 llvm/test/Transforms/InstSimplify/pr87042.ll | 42 
 2 files changed, 44 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/Transforms/InstSimplify/pr87042.ll

diff --git a/llvm/lib/Analysis/InstructionSimplify.cpp 
b/llvm/lib/Analysis/InstructionSimplify.cpp
index d0c27cae0dff99..72b6dfa181e86d 100644
--- a/llvm/lib/Analysis/InstructionSimplify.cpp
+++ b/llvm/lib/Analysis/InstructionSimplify.cpp
@@ -439,7 +439,8 @@ static Value *threadBinOpOverSelect(Instruction::BinaryOps 
Opcode, Value *LHS,
 // Check that the simplified value has the form "X op Y" where "op" is the
 // same as the original operation.
 Instruction *Simplified = dyn_cast<Instruction>(FV ? FV : TV);
-if (Simplified && Simplified->getOpcode() == unsigned(Opcode)) {
+if (Simplified && Simplified->getOpcode() == unsigned(Opcode) &&
+!Simplified->hasPoisonGeneratingFlags()) {
   // The value that didn't simplify is "UnsimplifiedLHS op UnsimplifiedRHS".
   // We already know that "op" is the same as for the simplified value.  See
   // if the operands match too.  If so, return the simplified value.
diff --git a/llvm/test/Transforms/InstSimplify/pr87042.ll 
b/llvm/test/Transforms/InstSimplify/pr87042.ll
new file mode 100644
index 00..800d27c9e65043
--- /dev/null
+++ b/llvm/test/Transforms/InstSimplify/pr87042.ll
@@ -0,0 +1,42 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 4
+; RUN: opt < %s -passes=instsimplify -S | FileCheck %s
+
+; %or2 cannot be folded into %or1 because %or1 has disjoint.
+; TODO: Can we move the logic into InstCombine and drop the disjoint flag?
+define i64 @test(i1 %cond, i64 %x) {
+; CHECK-LABEL: define i64 @test(
+; CHECK-SAME: i1 [[COND:%.*]], i64 [[X:%.*]]) {
+; CHECK-NEXT:[[OR1:%.*]] = or disjoint i64 [[X]], 7
+; CHECK-NEXT:[[SEL1:%.*]] = select i1 [[COND]], i64 [[OR1]], i64 [[X]]
+; CHECK-NEXT:[[OR2:%.*]] = or i64 [[SEL1]], 7
+; CHECK-NEXT:ret i64 [[OR2]]
+;
+  %or1 = or disjoint i64 %x, 7
+  %sel1 = select i1 %cond, i64 %or1, i64 %x
+  %or2 = or i64 %sel1, 7
+  ret i64 %or2
+}
+
+define i64 @pr87042(i64 %x) {
+; CHECK-LABEL: define i64 @pr87042(
+; CHECK-SAME: i64 [[X:%.*]]) {
+; CHECK-NEXT:[[AND1:%.*]] = and i64 [[X]], 65535
+; CHECK-NEXT:[[CMP1:%.*]] = icmp eq i64 [[AND1]], 0
+; CHECK-NEXT:[[OR1:%.*]] = or disjoint i64 [[X]], 7
+; CHECK-NEXT:[[SEL1:%.*]] = select i1 [[CMP1]], i64 [[OR1]], i64 [[X]]
+; CHECK-NEXT:[[AND2:%.*]] = and i64 [[SEL1]], 16776960
+; CHECK-NEXT:[[CMP2:%.*]] = icmp eq i64 [[AND2]], 0
+; CHECK-NEXT:[[OR2:%.*]] = or i64 [[SEL1]], 7
+; CHECK-NEXT:[[SEL2:%.*]] = select i1 [[CMP2]], i64 [[OR2]], i64 [[SEL1]]
+; CHECK-NEXT:ret i64 [[SEL2]]
+;
+  %and1 = and i64 %x, 65535
+  %cmp1 = icmp eq i64 %and1, 0
+  %or1 = or disjoint i64 %x, 7
+  %sel1 = select i1 %cmp1, i64 %or1, i64 %x
+  %and2 = and i64 %sel1, 16776960
+  %cmp2 = icmp eq i64 %and2, 0
+  %or2 = or i64 %sel1, 7
+  %sel2 = select i1 %cmp2, i64 %or2, i64 %sel1
+  ret i64 %sel2
+}



[llvm-branch-commits] [llvm] d0ddcce - [InstSimplify] Make sure the simplified value doesn't generate poison in threadBinOpOverSelect (#87075)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

Author: Yingwei Zheng
Date: 2024-04-15T16:00:46-07:00
New Revision: d0ddcce21d91eb8755ba45d700f6e0c6f6671a79

URL: 
https://github.com/llvm/llvm-project/commit/d0ddcce21d91eb8755ba45d700f6e0c6f6671a79
DIFF: 
https://github.com/llvm/llvm-project/commit/d0ddcce21d91eb8755ba45d700f6e0c6f6671a79.diff

LOG: [InstSimplify] Make sure the simplified value doesn't generate poison in 
threadBinOpOverSelect (#87075)

Alive2: https://alive2.llvm.org/ce/z/y_Jmdn
Fix https://github.com/llvm/llvm-project/issues/87042.

(cherry picked from commit 3197f9d8b0efc3efdc531421bd11c16305d9b1ff)
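
Why the fold is unsound can be seen with a small model of the first test
case in pr87042.ll. This is only a sketch: poison is modelled as an empty
std::optional, and all function names are hypothetical, not LLVM APIs:

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

using MaybePoison = std::optional<uint64_t>; // std::nullopt models poison

// Plain `or`: always well defined.
inline MaybePoison orPlain(uint64_t a, uint64_t b) { return a | b; }

// `or disjoint`: poison if the operands share any set bit.
inline MaybePoison orDisjoint(uint64_t a, uint64_t b) {
  if ((a & b) != 0)
    return std::nullopt;
  return a | b;
}

// %sel1 = select cond, (or disjoint x, 7), x ; %or2 = or %sel1, 7
inline MaybePoison original(bool cond, uint64_t x) {
  MaybePoison sel = cond ? orDisjoint(x, 7) : MaybePoison(x);
  if (!sel)
    return std::nullopt;
  return orPlain(*sel, 7);
}

// The rejected fold: replace %or2 with %or1 (the `or disjoint`).
inline MaybePoison foldedToOr1(uint64_t x) { return orDisjoint(x, 7); }
```

With cond false and x = 1, the original function returns 7 while the
folded form is poison, which is the case the added
hasPoisonGeneratingFlags() check excludes.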

Added: 
llvm/test/Transforms/InstSimplify/pr87042.ll

Modified: 
llvm/lib/Analysis/InstructionSimplify.cpp

Removed: 




diff --git a/llvm/lib/Analysis/InstructionSimplify.cpp 
b/llvm/lib/Analysis/InstructionSimplify.cpp
index d0c27cae0dff99..72b6dfa181e86d 100644
--- a/llvm/lib/Analysis/InstructionSimplify.cpp
+++ b/llvm/lib/Analysis/InstructionSimplify.cpp
@@ -439,7 +439,8 @@ static Value *threadBinOpOverSelect(Instruction::BinaryOps 
Opcode, Value *LHS,
 // Check that the simplified value has the form "X op Y" where "op" is the
 // same as the original operation.
 Instruction *Simplified = dyn_cast<Instruction>(FV ? FV : TV);
-if (Simplified && Simplified->getOpcode() == unsigned(Opcode)) {
+if (Simplified && Simplified->getOpcode() == unsigned(Opcode) &&
+!Simplified->hasPoisonGeneratingFlags()) {
   // The value that didn't simplify is "UnsimplifiedLHS op UnsimplifiedRHS".
   // We already know that "op" is the same as for the simplified value.  See
   // if the operands match too.  If so, return the simplified value.

diff --git a/llvm/test/Transforms/InstSimplify/pr87042.ll 
b/llvm/test/Transforms/InstSimplify/pr87042.ll
new file mode 100644
index 00..800d27c9e65043
--- /dev/null
+++ b/llvm/test/Transforms/InstSimplify/pr87042.ll
@@ -0,0 +1,42 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py 
UTC_ARGS: --version 4
+; RUN: opt < %s -passes=instsimplify -S | FileCheck %s
+
+; %or2 cannot be folded into %or1 because %or1 has disjoint.
+; TODO: Can we move the logic into InstCombine and drop the disjoint flag?
+define i64 @test(i1 %cond, i64 %x) {
+; CHECK-LABEL: define i64 @test(
+; CHECK-SAME: i1 [[COND:%.*]], i64 [[X:%.*]]) {
+; CHECK-NEXT:[[OR1:%.*]] = or disjoint i64 [[X]], 7
+; CHECK-NEXT:[[SEL1:%.*]] = select i1 [[COND]], i64 [[OR1]], i64 [[X]]
+; CHECK-NEXT:[[OR2:%.*]] = or i64 [[SEL1]], 7
+; CHECK-NEXT:ret i64 [[OR2]]
+;
+  %or1 = or disjoint i64 %x, 7
+  %sel1 = select i1 %cond, i64 %or1, i64 %x
+  %or2 = or i64 %sel1, 7
+  ret i64 %or2
+}
+
+define i64 @pr87042(i64 %x) {
+; CHECK-LABEL: define i64 @pr87042(
+; CHECK-SAME: i64 [[X:%.*]]) {
+; CHECK-NEXT:[[AND1:%.*]] = and i64 [[X]], 65535
+; CHECK-NEXT:[[CMP1:%.*]] = icmp eq i64 [[AND1]], 0
+; CHECK-NEXT:[[OR1:%.*]] = or disjoint i64 [[X]], 7
+; CHECK-NEXT:[[SEL1:%.*]] = select i1 [[CMP1]], i64 [[OR1]], i64 [[X]]
+; CHECK-NEXT:[[AND2:%.*]] = and i64 [[SEL1]], 16776960
+; CHECK-NEXT:[[CMP2:%.*]] = icmp eq i64 [[AND2]], 0
+; CHECK-NEXT:[[OR2:%.*]] = or i64 [[SEL1]], 7
+; CHECK-NEXT:[[SEL2:%.*]] = select i1 [[CMP2]], i64 [[OR2]], i64 [[SEL1]]
+; CHECK-NEXT:ret i64 [[SEL2]]
+;
+  %and1 = and i64 %x, 65535
+  %cmp1 = icmp eq i64 %and1, 0
+  %or1 = or disjoint i64 %x, 7
+  %sel1 = select i1 %cmp1, i64 %or1, i64 %x
+  %and2 = and i64 %sel1, 16776960
+  %cmp2 = icmp eq i64 %and2, 0
+  %or2 = or i64 %sel1, 7
+  %sel2 = select i1 %cmp2, i64 %or2, i64 %sel1
+  ret i64 %sel2
+}





[llvm-branch-commits] [llvm] release/18.x: [InstSimplify] Make sure the simplified value doesn't generate poison in threadBinOpOverSelect (#87075) (PR #88353)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/88353


[llvm-branch-commits] [llvm] release/18.x: [Codegen][X86] Fix /HOTPATCH with clang-cl and inline asm (#87639) (PR #88388)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/88388

>From c837970dd7e97a7ea8eac149ff19357d75c59083 Mon Sep 17 00:00:00 2001
From: Alexandre Ganea <37383324+aga...@users.noreply.github.com>
Date: Mon, 8 Apr 2024 20:02:19 -0400
Subject: [PATCH] [Codegen][X86] Fix /HOTPATCH with clang-cl and inline asm
 (#87639)

This fixes an edge case where functions starting with inline assembly
would assert while trying to lower that inline asm instruction.

After this PR, for now we always add a no-op (xchgw in this case) without
considering the size of the next inline asm instruction. We might want
to revisit this in the future.

This fixes Unreal Engine 5.3.2 compilation with clang-cl and /HOTPATCH.

Should close https://github.com/llvm/llvm-project/issues/56234

(cherry picked from commit ec1af63dde58c735fe60d6f2aafdb10fa93f410d)
---
 llvm/lib/Target/X86/X86MCInstLower.cpp  |  4 +++-
 llvm/test/CodeGen/X86/patchable-prologue.ll | 17 +
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Target/X86/X86MCInstLower.cpp 
b/llvm/lib/Target/X86/X86MCInstLower.cpp
index 58ebe023cd61ec..7ce0aa22b99795 100644
--- a/llvm/lib/Target/X86/X86MCInstLower.cpp
+++ b/llvm/lib/Target/X86/X86MCInstLower.cpp
@@ -959,8 +959,10 @@ void X86AsmPrinter::LowerPATCHABLE_OP(const MachineInstr 
&MI,
   SmallString<256> Code;
   unsigned MinSize = MI.getOperand(0).getImm();
 
-  if (NextMI != MI.getParent()->end()) {
+  if (NextMI != MI.getParent()->end() && !NextMI->isInlineAsm()) {
 // Lower the next MachineInstr to find its byte size.
+// If the next instruction is inline assembly, we skip lowering it for now,
+// and assume we should always generate NOPs.
 MCInst MCI;
 MCIL.Lower(&*NextMI, MCI);
 
diff --git a/llvm/test/CodeGen/X86/patchable-prologue.ll 
b/llvm/test/CodeGen/X86/patchable-prologue.ll
index 71a392845fdea3..43761e3d1e1eb9 100644
--- a/llvm/test/CodeGen/X86/patchable-prologue.ll
+++ b/llvm/test/CodeGen/X86/patchable-prologue.ll
@@ -193,3 +193,20 @@ do.body:  ; preds 
= %do.body, %entry
 do.end:   ; preds = %do.body
   ret void
 }
+
+
+; Test that inline asm is properly hotpatched. We currently don't examine the
+; asm instruction when printing it, thus we always emit patching NOPs.
+
+; 64: inline_asm:
+; 64-NEXT: # %bb.0:
+; 64-NEXT: xchgw   %ax, %ax# encoding: [0x66,0x90]
+; 64-NEXT: #APP
+; 64-NEXT: int3# encoding: [0xcc]
+; 64-NEXT: #NO_APP
+
+define dso_local void @inline_asm() 
"patchable-function"="prologue-short-redirect" {
+entry:
+  call void asm sideeffect "int3", "~{dirflag},~{fpsr},~{flags}"()
+  ret void
+}



[llvm-branch-commits] [llvm] c837970 - [Codegen][X86] Fix /HOTPATCH with clang-cl and inline asm (#87639)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

Author: Alexandre Ganea
Date: 2024-04-15T16:02:49-07:00
New Revision: c837970dd7e97a7ea8eac149ff19357d75c59083

URL: 
https://github.com/llvm/llvm-project/commit/c837970dd7e97a7ea8eac149ff19357d75c59083
DIFF: 
https://github.com/llvm/llvm-project/commit/c837970dd7e97a7ea8eac149ff19357d75c59083.diff

LOG: [Codegen][X86] Fix /HOTPATCH with clang-cl and inline asm (#87639)

This fixes an edge case where functions starting with inline assembly
would assert while trying to lower that inline asm instruction.

After this PR, for now we always add a no-op (xchgw in this case) without
considering the size of the next inline asm instruction. We might want
to revisit this in the future.

This fixes Unreal Engine 5.3.2 compilation with clang-cl and /HOTPATCH.

Should close https://github.com/llvm/llvm-project/issues/56234

(cherry picked from commit ec1af63dde58c735fe60d6f2aafdb10fa93f410d)
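
The padding decision described above can be sketched as a small pure
function. The NextInstr struct and nopBytesNeeded are hypothetical names
for illustration; the sketch assumes a full minSize-byte NOP is emitted
whenever the following instruction is unknown or too short, which matches
the 2-byte xchgw placed before the 1-byte int3 in the new test:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical summary of the instruction after the patchable marker.
struct NextInstr {
  bool exists;             // false at the end of the basic block
  bool isInlineAsm;        // inline asm is not lowered to measure its size
  std::size_t encodedSize; // bytes; only used when the size can be measured
};

// Returns the number of NOP bytes to emit so that at least `minSize`
// bytes at the function entry can be overwritten by a patch.
inline std::size_t nopBytesNeeded(const NextInstr &next, std::size_t minSize) {
  std::size_t known = 0;
  if (next.exists && !next.isInlineAsm)
    known = next.encodedSize;
  return known < minSize ? minSize : 0;
}
```

An inline asm instruction is therefore treated like a missing one: its
size is taken as unknown and the NOP is always emitted.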

Added: 


Modified: 
llvm/lib/Target/X86/X86MCInstLower.cpp
llvm/test/CodeGen/X86/patchable-prologue.ll

Removed: 




diff --git a/llvm/lib/Target/X86/X86MCInstLower.cpp 
b/llvm/lib/Target/X86/X86MCInstLower.cpp
index 58ebe023cd61ec..7ce0aa22b99795 100644
--- a/llvm/lib/Target/X86/X86MCInstLower.cpp
+++ b/llvm/lib/Target/X86/X86MCInstLower.cpp
@@ -959,8 +959,10 @@ void X86AsmPrinter::LowerPATCHABLE_OP(const MachineInstr &MI,
   SmallString<256> Code;
   unsigned MinSize = MI.getOperand(0).getImm();
 
-  if (NextMI != MI.getParent()->end()) {
+  if (NextMI != MI.getParent()->end() && !NextMI->isInlineAsm()) {
     // Lower the next MachineInstr to find its byte size.
+    // If the next instruction is inline assembly, we skip lowering it for now,
+    // and assume we should always generate NOPs.
     MCInst MCI;
     MCIL.Lower(&*NextMI, MCI);
 

diff  --git a/llvm/test/CodeGen/X86/patchable-prologue.ll 
b/llvm/test/CodeGen/X86/patchable-prologue.ll
index 71a392845fdea3..43761e3d1e1eb9 100644
--- a/llvm/test/CodeGen/X86/patchable-prologue.ll
+++ b/llvm/test/CodeGen/X86/patchable-prologue.ll
@@ -193,3 +193,20 @@ do.body:  ; preds = %do.body, %entry
 do.end:   ; preds = %do.body
   ret void
 }
+
+
+; Test that inline asm is properly hotpatched. We currently don't examine the
+; asm instruction when printing it, thus we always emit patching NOPs.
+
+; 64: inline_asm:
+; 64-NEXT: # %bb.0:
+; 64-NEXT: xchgw   %ax, %ax                # encoding: [0x66,0x90]
+; 64-NEXT: #APP
+; 64-NEXT: int3                            # encoding: [0xcc]
+; 64-NEXT: #NO_APP
+
+define dso_local void @inline_asm() "patchable-function"="prologue-short-redirect" {
+entry:
+  call void asm sideeffect "int3", "~{dirflag},~{fpsr},~{flags}"()
+  ret void
+}





[llvm-branch-commits] [llvm] release/18.x: [Codegen][X86] Fix /HOTPATCH with clang-cl and inline asm (#87639) (PR #88388)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/88388


[llvm-branch-commits] [libcxx] release/18.x: [libc++] Fix -Wgnu-include-next in stddef.h (#88214) (PR #88419)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

Can we ignore this test failure?

https://github.com/llvm/llvm-project/pull/88419


[llvm-branch-commits] [llvm] release/18.x: github-upload-release.py: Fix bug preventing release creation (#84571) (PR #88425)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/88425

>From c24b41d71f2e5658b0b6618482831f34820f6b4a Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Fri, 8 Mar 2024 21:17:27 -0800
Subject: [PATCH] github-upload-release.py: Fix bug preventing release creation
 (#84571)

After aa02002491333c42060373bc84f1ff5d2c76b4ce we started passing the
user name to the create_release function and this was being interpreted
as the git tag.

(cherry picked from commit 0b9ce71a256d86c08f2b52ad2e337395b8f54b41)
---
 llvm/utils/release/github-upload-release.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/llvm/utils/release/github-upload-release.py 
b/llvm/utils/release/github-upload-release.py
index 14ec05062d88c8..8343dee937f78f 100755
--- a/llvm/utils/release/github-upload-release.py
+++ b/llvm/utils/release/github-upload-release.py
@@ -107,6 +107,6 @@ def upload_files(repo, release, files):
 sys.exit(1)
 
 if args.command == "create":
-    create_release(llvm_repo, args.release, args.user)
+    create_release(llvm_repo, args.release)
 if args.command == "upload":
     upload_files(llvm_repo, args.release, args.files)



[llvm-branch-commits] [llvm] c24b41d - github-upload-release.py: Fix bug preventing release creation (#84571)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

Author: Tom Stellard
Date: 2024-04-15T16:13:26-07:00
New Revision: c24b41d71f2e5658b0b6618482831f34820f6b4a

URL: 
https://github.com/llvm/llvm-project/commit/c24b41d71f2e5658b0b6618482831f34820f6b4a
DIFF: 
https://github.com/llvm/llvm-project/commit/c24b41d71f2e5658b0b6618482831f34820f6b4a.diff

LOG: github-upload-release.py: Fix bug preventing release creation (#84571)

After aa02002491333c42060373bc84f1ff5d2c76b4ce we started passing the
user name to the create_release function and this was being interpreted
as the git tag.

(cherry picked from commit 0b9ce71a256d86c08f2b52ad2e337395b8f54b41)
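
The failure mode described above is a positional argument landing in the wrong parameter. The sketch below illustrates it with a hypothetical stand-in for `create_release`: the real function's signature is not shown in this patch, so the `tag` and `name` parameters, their defaults, and the sample values are assumptions made for illustration only.

```python
def create_release(repo, release, tag=None, name=None):
    """Hypothetical stand-in; the real signature is not shown in the patch."""
    if tag is None:
        # Assumed default: derive the git tag from the release version.
        tag = "llvmorg-{}".format(release)
    if name is None:
        name = "LLVM {}".format(release)
    return {"repo": repo, "tag": tag, "name": name}


# Before the fix: the extra positional argument (the user name) binds to `tag`.
buggy = create_release("llvm-project", "18.1.4", "some-user")

# After the fix: no third argument, so the tag falls back to its default.
fixed = create_release("llvm-project", "18.1.4")

print(buggy["tag"])  # some-user
print(fixed["tag"])  # llvmorg-18.1.4
```

Passing such values by keyword (for example `tag=...`) would make this kind of mix-up harder to introduce, though that is a design suggestion and not something the patch does.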

Added: 


Modified: 
llvm/utils/release/github-upload-release.py

Removed: 




diff  --git a/llvm/utils/release/github-upload-release.py 
b/llvm/utils/release/github-upload-release.py
index 14ec05062d88c8..8343dee937f78f 100755
--- a/llvm/utils/release/github-upload-release.py
+++ b/llvm/utils/release/github-upload-release.py
@@ -107,6 +107,6 @@ def upload_files(repo, release, files):
 sys.exit(1)
 
 if args.command == "create":
-    create_release(llvm_repo, args.release, args.user)
+    create_release(llvm_repo, args.release)
 if args.command == "upload":
     upload_files(llvm_repo, args.release, args.files)





[llvm-branch-commits] [llvm] release/18.x: github-upload-release.py: Fix bug preventing release creation (#84571) (PR #88425)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/88425


[llvm-branch-commits] [compiler-rt] release/18.x: [RISCV] Support rv{32, 64}e in the compiler builtins (#88252) (PR #88525)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar updated 
https://github.com/llvm/llvm-project/pull/88525

>From eaae766a20fdd2d5f0c6b3f04d7f238a6aa1f814 Mon Sep 17 00:00:00 2001
From: Cyrill Leutwiler 
Date: Thu, 11 Apr 2024 07:11:51 +0200
Subject: [PATCH] [RISCV] Support rv{32, 64}e in the compiler builtins (#88252)

Register spills (save/restore) in RISC-V embedded work differently
because there are fewer registers and different stack alignment.

[GCC equivalent](https://github.com/gcc-mirror/gcc/blob/master/libgcc/config/riscv/save-restore.S#L298C16-L336)

Follow up from #76777.

-

Signed-off-by: xermicus 
(cherry picked from commit bd32aaa8c9ec2094f605315b3989adc2a567ca98)
---
 compiler-rt/lib/builtins/riscv/restore.S | 42 
 compiler-rt/lib/builtins/riscv/save.S| 42 
 2 files changed, 84 insertions(+)

diff --git a/compiler-rt/lib/builtins/riscv/restore.S 
b/compiler-rt/lib/builtins/riscv/restore.S
index 73f64a920d6698..6f43842c8ca684 100644
--- a/compiler-rt/lib/builtins/riscv/restore.S
+++ b/compiler-rt/lib/builtins/riscv/restore.S
@@ -22,6 +22,8 @@
 
 #if __riscv_xlen == 32
 
+#ifndef __riscv_32e
+
   .globl  __riscv_restore_12
   .type   __riscv_restore_12,@function
 __riscv_restore_12:
@@ -86,8 +88,29 @@ __riscv_restore_0:
   addi    sp, sp, 16
   ret
 
+#else
+
+  .globl  __riscv_restore_2
+  .type   __riscv_restore_2,@function
+  .globl  __riscv_restore_1
+  .type   __riscv_restore_1,@function
+  .globl  __riscv_restore_0
+  .type   __riscv_restore_0,@function
+__riscv_restore_2:
+__riscv_restore_1:
+__riscv_restore_0:
+  lw  s1,  0(sp)
+  lw  s0,  4(sp)
+  lw  ra,  8(sp)
+  addi    sp, sp, 12
+  ret
+
+#endif
+
 #elif __riscv_xlen == 64
 
+#ifndef __riscv_64e
+
   .globl  __riscv_restore_12
   .type   __riscv_restore_12,@function
 __riscv_restore_12:
@@ -161,6 +184,25 @@ __riscv_restore_0:
   addi    sp, sp, 16
   ret
 
+#else
+
+  .globl  __riscv_restore_2
+  .type   __riscv_restore_2,@function
+  .globl  __riscv_restore_1
+  .type   __riscv_restore_1,@function
+  .globl  __riscv_restore_0
+  .type   __riscv_restore_0,@function
+__riscv_restore_2:
+__riscv_restore_1:
+__riscv_restore_0:
+  ld  s1,  0(sp)
+  ld  s0,  8(sp)
+  ld  ra,  16(sp)
+  addi    sp, sp, 24
+  ret
+
+#endif
+
 #else
 # error "xlen must be 32 or 64 for save-restore implementation
 #endif
diff --git a/compiler-rt/lib/builtins/riscv/save.S 
b/compiler-rt/lib/builtins/riscv/save.S
index 85501aeb4c2e93..3e044179ff7f1d 100644
--- a/compiler-rt/lib/builtins/riscv/save.S
+++ b/compiler-rt/lib/builtins/riscv/save.S
@@ -18,6 +18,8 @@
 
 #if __riscv_xlen == 32
 
+#ifndef __riscv_32e
+
   .globl  __riscv_save_12
   .type   __riscv_save_12,@function
 __riscv_save_12:
@@ -92,8 +94,29 @@ __riscv_save_0:
   sw  ra,  12(sp)
   jr  t0
 
+#else
+
+  .globl  __riscv_save_2
+  .type   __riscv_save_2,@function
+  .globl  __riscv_save_1
+  .type   __riscv_save_1,@function
+  .globl  __riscv_save_0
+  .type   __riscv_save_0,@function
+__riscv_save_2:
+__riscv_save_1:
+__riscv_save_0:
+  addi    sp, sp, -12
+  sw  s1,  0(sp)
+  sw  s0,  4(sp)
+  sw  ra,  8(sp)
+  jr  t0
+
+#endif
+
 #elif __riscv_xlen == 64
 
+#ifndef __riscv_64e
+
   .globl  __riscv_save_12
   .type   __riscv_save_12,@function
 __riscv_save_12:
@@ -181,6 +204,25 @@ __riscv_save_0:
   sd ra, 8(sp)
   jr t0
 
+#else
+
+  .globl  __riscv_save_2
+  .type   __riscv_save_2,@function
+  .globl  __riscv_save_1
+  .type   __riscv_save_1,@function
+  .globl  __riscv_save_0
+  .type   __riscv_save_0,@function
+__riscv_save_2:
+__riscv_save_1:
+__riscv_save_0:
+  addi   sp, sp, -24
+  sd s1, 0(sp)
+  sd s0, 8(sp)
+  sd ra, 16(sp)
+  jr t0
+
+#endif
+
 #else
 # error "xlen must be 32 or 64 for save-restore implementation
 #endif



[llvm-branch-commits] [compiler-rt] eaae766 - [RISCV] Support rv{32, 64}e in the compiler builtins (#88252)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

Author: Cyrill Leutwiler
Date: 2024-04-15T16:18:14-07:00
New Revision: eaae766a20fdd2d5f0c6b3f04d7f238a6aa1f814

URL: 
https://github.com/llvm/llvm-project/commit/eaae766a20fdd2d5f0c6b3f04d7f238a6aa1f814
DIFF: 
https://github.com/llvm/llvm-project/commit/eaae766a20fdd2d5f0c6b3f04d7f238a6aa1f814.diff

LOG: [RISCV] Support rv{32, 64}e in the compiler builtins (#88252)

Register spills (save/restore) in RISC-V embedded work differently
because there are fewer registers and different stack alignment.

[GCC equivalent](https://github.com/gcc-mirror/gcc/blob/master/libgcc/config/riscv/save-restore.S#L298C16-L336)

Follow up from #76777.

-

Signed-off-by: xermicus 
(cherry picked from commit bd32aaa8c9ec2094f605315b3989adc2a567ca98)
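
The stack adjustments in the diff below (12 bytes on rv32e, 24 bytes on rv64e) follow from the register count and the stack alignment. The following is a minimal sketch of that arithmetic, not code from compiler-rt: `save_frame_bytes` is a hypothetical helper, and it assumes word-sized stack alignment for the embedded ABIs versus 16-byte alignment for the standard ones.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def save_frame_bytes(xlen: int, num_regs: int, embedded: bool) -> int:
    """Bytes of stack needed to spill num_regs registers (hypothetical helper)."""
    if xlen not in (32, 64):
        raise ValueError("xlen must be 32 or 64")
    word = xlen // 8
    # Assumption: the embedded ABIs align the stack to one word,
    # while the standard ABIs align it to 16 bytes.
    stack_align = word if embedded else 16
    return align_up(num_regs * word, stack_align)


# ra, s0 and s1 are the three registers spilled by the rv{32,64}e routines.
print(save_frame_bytes(32, 3, embedded=True))   # 12
print(save_frame_bytes(64, 3, embedded=True))   # 24
print(save_frame_bytes(32, 3, embedded=False))  # 16
```

The first two results match the `addi sp, sp, -12` and `addi sp, sp, -24` instructions added for the embedded variants.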

Added: 


Modified: 
compiler-rt/lib/builtins/riscv/restore.S
compiler-rt/lib/builtins/riscv/save.S

Removed: 




diff  --git a/compiler-rt/lib/builtins/riscv/restore.S 
b/compiler-rt/lib/builtins/riscv/restore.S
index 73f64a920d6698..6f43842c8ca684 100644
--- a/compiler-rt/lib/builtins/riscv/restore.S
+++ b/compiler-rt/lib/builtins/riscv/restore.S
@@ -22,6 +22,8 @@
 
 #if __riscv_xlen == 32
 
+#ifndef __riscv_32e
+
   .globl  __riscv_restore_12
   .type   __riscv_restore_12,@function
 __riscv_restore_12:
@@ -86,8 +88,29 @@ __riscv_restore_0:
   addi    sp, sp, 16
   ret
 
+#else
+
+  .globl  __riscv_restore_2
+  .type   __riscv_restore_2,@function
+  .globl  __riscv_restore_1
+  .type   __riscv_restore_1,@function
+  .globl  __riscv_restore_0
+  .type   __riscv_restore_0,@function
+__riscv_restore_2:
+__riscv_restore_1:
+__riscv_restore_0:
+  lw  s1,  0(sp)
+  lw  s0,  4(sp)
+  lw  ra,  8(sp)
+  addi    sp, sp, 12
+  ret
+
+#endif
+
 #elif __riscv_xlen == 64
 
+#ifndef __riscv_64e
+
   .globl  __riscv_restore_12
   .type   __riscv_restore_12,@function
 __riscv_restore_12:
@@ -161,6 +184,25 @@ __riscv_restore_0:
   addi    sp, sp, 16
   ret
 
+#else
+
+  .globl  __riscv_restore_2
+  .type   __riscv_restore_2,@function
+  .globl  __riscv_restore_1
+  .type   __riscv_restore_1,@function
+  .globl  __riscv_restore_0
+  .type   __riscv_restore_0,@function
+__riscv_restore_2:
+__riscv_restore_1:
+__riscv_restore_0:
+  ld  s1,  0(sp)
+  ld  s0,  8(sp)
+  ld  ra,  16(sp)
+  addi    sp, sp, 24
+  ret
+
+#endif
+
 #else
 # error "xlen must be 32 or 64 for save-restore implementation
 #endif

diff  --git a/compiler-rt/lib/builtins/riscv/save.S 
b/compiler-rt/lib/builtins/riscv/save.S
index 85501aeb4c2e93..3e044179ff7f1d 100644
--- a/compiler-rt/lib/builtins/riscv/save.S
+++ b/compiler-rt/lib/builtins/riscv/save.S
@@ -18,6 +18,8 @@
 
 #if __riscv_xlen == 32
 
+#ifndef __riscv_32e
+
   .globl  __riscv_save_12
   .type   __riscv_save_12,@function
 __riscv_save_12:
@@ -92,8 +94,29 @@ __riscv_save_0:
   sw  ra,  12(sp)
   jr  t0
 
+#else
+
+  .globl  __riscv_save_2
+  .type   __riscv_save_2,@function
+  .globl  __riscv_save_1
+  .type   __riscv_save_1,@function
+  .globl  __riscv_save_0
+  .type   __riscv_save_0,@function
+__riscv_save_2:
+__riscv_save_1:
+__riscv_save_0:
+  addi    sp, sp, -12
+  sw  s1,  0(sp)
+  sw  s0,  4(sp)
+  sw  ra,  8(sp)
+  jr  t0
+
+#endif
+
 #elif __riscv_xlen == 64
 
+#ifndef __riscv_64e
+
   .globl  __riscv_save_12
   .type   __riscv_save_12,@function
 __riscv_save_12:
@@ -181,6 +204,25 @@ __riscv_save_0:
   sd ra, 8(sp)
   jr t0
 
+#else
+
+  .globl  __riscv_save_2
+  .type   __riscv_save_2,@function
+  .globl  __riscv_save_1
+  .type   __riscv_save_1,@function
+  .globl  __riscv_save_0
+  .type   __riscv_save_0,@function
+__riscv_save_2:
+__riscv_save_1:
+__riscv_save_0:
+  addi   sp, sp, -24
+  sd s1, 0(sp)
+  sd s0, 8(sp)
+  sd ra, 16(sp)
+  jr t0
+
+#endif
+
 #else
 # error "xlen must be 32 or 64 for save-restore implementation
 #endif





[llvm-branch-commits] [compiler-rt] release/18.x: [RISCV] Support rv{32, 64}e in the compiler builtins (#88252) (PR #88525)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/88525


[llvm-branch-commits] [llvm] release/18.x: [X86] Fix miscompile in combineShiftRightArithmetic (PR #86728)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@phoebewang What do you think about backporting this?

https://github.com/llvm/llvm-project/pull/86728


[llvm-branch-commits] [clang] release/18.x [X86_64] fix SSE type error in vaarg (PR #86698)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@phoebewang What do you think about backporting this?

https://github.com/llvm/llvm-project/pull/86698


[llvm-branch-commits] [llvm] release/18.x [SelectionDAG] Prevent combination on inconsistent type in 'carryDiamond' (PR #86697)

2024-04-15 Thread Tom Stellard via llvm-branch-commits

tstellar wrote:

@arsenm What do you think about backporting this?

https://github.com/llvm/llvm-project/pull/86697


[llvm-branch-commits] [llvm] Revert "[Mips] Fix missing sign extension in expansion of sub-word atomic max (#77072)" (PR #88818)

2024-04-15 Thread Nikita Popov via llvm-branch-commits

https://github.com/nikic milestoned 
https://github.com/llvm/llvm-project/pull/88818


[llvm-branch-commits] [llvm] Revert "[Mips] Fix missing sign extension in expansion of sub-word atomic max (#77072)" (PR #88818)

2024-04-15 Thread Nikita Popov via llvm-branch-commits

https://github.com/nikic created https://github.com/llvm/llvm-project/pull/88818

…omic max (#77072)"

These changes caused correctness regressions observed in Rust, see 
https://github.com/llvm/llvm-project/pull/77072#issuecomment-2049009507 and 
following. Revert the incorrect changes from the release branch.

This reverts commit 0e501dbd932ef1c6f4e747c83bf33beef0a09ecf.
This reverts commit fbb27d16fa12aa595cbd20a1fb5f1c5b80748fa4.

>From 8b6d4e5d2293ee529405988780d65f0700d6275a Mon Sep 17 00:00:00 2001
From: Nikita Popov 
Date: Tue, 16 Apr 2024 09:10:46 +0900
Subject: [PATCH] Revert "[Mips] Fix missing sign extension in expansion of
 sub-word atomic max (#77072)"

These changes caused correctness regressions observed in Rust,
see
https://github.com/llvm/llvm-project/pull/77072#issuecomment-2049009507.

This reverts commit 0e501dbd932ef1c6f4e747c83bf33beef0a09ecf.
This reverts commit fbb27d16fa12aa595cbd20a1fb5f1c5b80748fa4.
---
 llvm/lib/Target/Mips/MipsExpandPseudo.cpp |  60 +--
 llvm/test/CodeGen/Mips/atomic-min-max.ll  | 615 +++---
 2 files changed, 81 insertions(+), 594 deletions(-)

diff --git a/llvm/lib/Target/Mips/MipsExpandPseudo.cpp 
b/llvm/lib/Target/Mips/MipsExpandPseudo.cpp
index c30129743a9626..2c2554b5b4bc3b 100644
--- a/llvm/lib/Target/Mips/MipsExpandPseudo.cpp
+++ b/llvm/lib/Target/Mips/MipsExpandPseudo.cpp
@@ -388,32 +388,18 @@ bool MipsExpandPseudo::expandAtomicBinOpSubword(
 Opcode = Mips::XOR;
 break;
   case Mips::ATOMIC_LOAD_UMIN_I8_POSTRA:
-IsUnsigned = true;
-IsMin = true;
-break;
   case Mips::ATOMIC_LOAD_UMIN_I16_POSTRA:
 IsUnsigned = true;
-IsMin = true;
-break;
+[[fallthrough]];
   case Mips::ATOMIC_LOAD_MIN_I8_POSTRA:
-SEOp = Mips::SEB;
-IsMin = true;
-break;
   case Mips::ATOMIC_LOAD_MIN_I16_POSTRA:
 IsMin = true;
 break;
   case Mips::ATOMIC_LOAD_UMAX_I8_POSTRA:
-IsUnsigned = true;
-IsMax = true;
-break;
   case Mips::ATOMIC_LOAD_UMAX_I16_POSTRA:
 IsUnsigned = true;
-IsMax = true;
-break;
+[[fallthrough]];
   case Mips::ATOMIC_LOAD_MAX_I8_POSTRA:
-SEOp = Mips::SEB;
-IsMax = true;
-break;
   case Mips::ATOMIC_LOAD_MAX_I16_POSTRA:
 IsMax = true;
 break;
@@ -475,42 +461,14 @@ bool MipsExpandPseudo::expandAtomicBinOpSubword(
 
 // For little endian we need to clear uninterested bits.
 if (STI->isLittle()) {
-  if (!IsUnsigned) {
-BuildMI(loopMBB, DL, TII->get(Mips::SRAV), OldVal)
-.addReg(OldVal)
-.addReg(ShiftAmnt);
-BuildMI(loopMBB, DL, TII->get(Mips::SRAV), Incr)
-.addReg(Incr)
-.addReg(ShiftAmnt);
-if (STI->hasMips32r2()) {
-  BuildMI(loopMBB, DL, TII->get(SEOp), OldVal).addReg(OldVal);
-  BuildMI(loopMBB, DL, TII->get(SEOp), Incr).addReg(Incr);
-} else {
-  const unsigned ShiftImm = SEOp == Mips::SEH ? 16 : 24;
-  BuildMI(loopMBB, DL, TII->get(Mips::SLL), OldVal)
-  .addReg(OldVal, RegState::Kill)
-  .addImm(ShiftImm);
-  BuildMI(loopMBB, DL, TII->get(Mips::SRA), OldVal)
-  .addReg(OldVal, RegState::Kill)
-  .addImm(ShiftImm);
-  BuildMI(loopMBB, DL, TII->get(Mips::SLL), Incr)
-  .addReg(Incr, RegState::Kill)
-  .addImm(ShiftImm);
-  BuildMI(loopMBB, DL, TII->get(Mips::SRA), Incr)
-  .addReg(Incr, RegState::Kill)
-  .addImm(ShiftImm);
-}
-  } else {
-// and OldVal, OldVal, Mask
-// and Incr, Incr, Mask
-BuildMI(loopMBB, DL, TII->get(Mips::AND), OldVal)
-.addReg(OldVal)
-.addReg(Mask);
-BuildMI(loopMBB, DL, TII->get(Mips::AND), Incr)
-.addReg(Incr)
-.addReg(Mask);
-  }
+  // and OldVal, OldVal, Mask
+  // and Incr, Incr, Mask
+  BuildMI(loopMBB, DL, TII->get(Mips::AND), OldVal)
+  .addReg(OldVal)
+  .addReg(Mask);
+  BuildMI(loopMBB, DL, TII->get(Mips::AND), Incr).addReg(Incr).addReg(Mask);
 }
+
 // unsigned: sltu Scratch4, oldVal, Incr
 // signed:   slt Scratch4, oldVal, Incr
 BuildMI(loopMBB, DL, TII->get(SLTScratch4), Scratch4)
diff --git a/llvm/test/CodeGen/Mips/atomic-min-max.ll 
b/llvm/test/CodeGen/Mips/atomic-min-max.ll
index a96581bdb39a4c..f953c885ea7345 100644
--- a/llvm/test/CodeGen/Mips/atomic-min-max.ll
+++ b/llvm/test/CodeGen/Mips/atomic-min-max.ll
@@ -3,7 +3,6 @@
 ; RUN: llc -march=mips -O0 -mcpu=mips32r6 -verify-machineinstrs %s -o - | FileCheck %s --check-prefix=MIPSR6
 ; RUN: llc -march=mips -O0 -mcpu=mips32r2 -mattr=+micromips -verify-machineinstrs %s -o - | FileCheck %s --check-prefix=MM
 ; RUN: llc -march=mips -O0 -mcpu=mips32r6 -mattr=+micromips -verify-machineinstrs %s -o - | FileCheck %s --check-prefix=MMR6
-; RUN: llc -march=mipsel -O0 -mcpu=mips32 -verify-machineinstrs %s -o - | FileCheck %s --check-prefix=MIPS32
 ; RUN: llc -march=mipsel -O0 

[llvm-branch-commits] [llvm] Revert "[Mips] Fix missing sign extension in expansion of sub-word atomic max (#77072)" (PR #88818)

2024-04-15 Thread Quentin Dian via llvm-branch-commits

https://github.com/DianQK approved this pull request.

LGTM.

https://github.com/llvm/llvm-project/pull/88818


[llvm-branch-commits] [llvm] Revert "[Mips] Fix missing sign extension in expansion of sub-word atomic max (#77072)" (PR #88818)

2024-04-15 Thread Nikita Popov via llvm-branch-commits

https://github.com/nikic edited https://github.com/llvm/llvm-project/pull/88818


[llvm-branch-commits] [clang] [Serialization] Code cleanups and polish 83233 (PR #83237)

2024-04-15 Thread Chuanqi Xu via llvm-branch-commits

ChuanqiXu9 wrote:

> Sorry, still struggling to get a small repro. The build graphs we have are 
> quite large, unfortunately. Did any of the stack traces or error message I 
> posted help to find certain problems? Or is there no hope until we get a 
> smaller repro?

I tried to find the problem by reviewing the patch alone, but it does not look 
easy to fix the failures without reproducers. As for performance, I am wondering 
whether we can improve it by caching the hash results, but it would help to have 
profile data for that.

Thanks a lot for testing this.

https://github.com/llvm/llvm-project/pull/83237