[llvm-branch-commits] [llvm] release/19.x: [RemoveDIs] Fix spliceDebugInfo splice-to-end edge case (#105671, #106723) (PR #106952)

2024-09-02 Thread Orlando Cazalet-Hyams via llvm-branch-commits

https://github.com/OCHyams created 
https://github.com/llvm/llvm-project/pull/106952

Please can we backport 43661a1214353ea1773a711f403f8d1118e9ca0f (and 
7ffe67c17c524c2d3056c0721a33c7012dce3061) into the next dot release.

Replaces #106691 - this one includes a follow-up fix in 
7ffe67c17c524c2d3056c0721a33c7012dce3061. I couldn't create two separate 
requests because it creates conflicts.
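
The first commit concerns a splice whose destination iterator is end(), where there is no instruction to dereference. A toy model of that hazard is sketched below using std::list rather than LLVM's BasicBlock. The names Block, Inst, adoptTrailing and checkSpliceToEnd are hypothetical and only illustrate the shape of the branch; this is not the LLVM implementation.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Toy stand-ins (assumptions for illustration): an instruction carries the
// debug records that precede it; records with no following instruction live
// in the block's "trailing" slot.
struct Inst {
  std::string Name;
  std::vector<std::string> Records;
};

struct Block {
  std::list<Inst> Insts;
  std::vector<std::string> Trailing;
};

// Move Src's trailing records so that they sit at position Dest in DestBB.
void adoptTrailing(Block &DestBB, std::list<Inst>::iterator Dest, Block &Src) {
  if (Dest == DestBB.Insts.end()) {
    // Nothing to attach to: dereferencing Dest here would be undefined
    // behaviour, so the records become DestBB's own trailing records.
    DestBB.Trailing.insert(DestBB.Trailing.end(), Src.Trailing.begin(),
                           Src.Trailing.end());
  } else {
    // Safe to dereference: attach the records in front of the instruction.
    Dest->Records.insert(Dest->Records.begin(), Src.Trailing.begin(),
                         Src.Trailing.end());
  }
  // The source must not keep its trailing records after the move.
  Src.Trailing.clear();
}

// Small usage example: splice trailing records into an empty block.
void checkSpliceToEnd() {
  Block Src, Dst;
  Src.Trailing = {"#dbg_value(%b)"};
  adoptTrailing(Dst, Dst.Insts.end(), Src);
  assert(Src.Trailing.empty());
  assert(Dst.Trailing.size() == 1);
}
```

Branching on the end() case first keeps the dereference confined to the path where it is known to be valid.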

>From 70d00400165ee76199a1e4565fdebb6d84d2eec7 Mon Sep 17 00:00:00 2001
From: Orlando Cazalet-Hyams 
Date: Thu, 29 Aug 2024 14:12:02 +0100
Subject: [PATCH 1/2] [RemoveDIs] Fix spliceDebugInfo splice-to-end edge case
 (#105671)

Fix #105571 which demonstrates an end() iterator dereference when
performing a non-empty splice to end() from a region that ends at
Src::end().

Rather than calling Instruction::adoptDbgRecords from Dest, create a marker
(which takes an iterator) and absorbDebugValues onto that. The "absorb" variant
doesn't clean up the source marker, which in this case we know is a trailing
marker, so we have to do that manually.

(cherry picked from commit 43661a1214353ea1773a711f403f8d1118e9ca0f)
---
 llvm/lib/IR/BasicBlock.cpp  | 12 -
 llvm/unittests/IR/BasicBlockDbgInfoTest.cpp | 54 +
 2 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/IR/BasicBlock.cpp b/llvm/lib/IR/BasicBlock.cpp
index 0a9498f051cb59..46896d3cdf7d50 100644
--- a/llvm/lib/IR/BasicBlock.cpp
+++ b/llvm/lib/IR/BasicBlock.cpp
@@ -975,8 +975,16 @@ void BasicBlock::spliceDebugInfoImpl(BasicBlock::iterator Dest, BasicBlock *Src,
   if (ReadFromTail && Src->getMarker(Last)) {
     DbgMarker *FromLast = Src->getMarker(Last);
     if (LastIsEnd) {
-      Dest->adoptDbgRecords(Src, Last, true);
-      // adoptDbgRecords will release any trailers.
+      if (Dest == end()) {
+        // Abosrb the trailing markers from Src.
+        assert(FromLast == Src->getTrailingDbgRecords());
+        createMarker(Dest)->absorbDebugValues(*FromLast, true);
+        FromLast->eraseFromParent();
+        Src->deleteTrailingDbgRecords();
+      } else {
+        // adoptDbgRecords will release any trailers.
+        Dest->adoptDbgRecords(Src, Last, true);
+      }
       assert(!Src->getTrailingDbgRecords());
     } else {
       // FIXME: can we use adoptDbgRecords here to reduce allocations?
diff --git a/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp 
b/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp
index 835780e63aaf4f..5615a4493d20a1 100644
--- a/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp
+++ b/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp
@@ -1525,4 +1525,58 @@ TEST(BasicBlockDbgInfoTest, DbgMoveToEnd) {
   EXPECT_FALSE(Ret->hasDbgRecords());
 }
 
+TEST(BasicBlockDbgInfoTest, CloneTrailingRecordsToEmptyBlock) {
+  LLVMContext C;
+  std::unique_ptr<Module> M = parseIR(C, R"(
+define i16 @foo(i16 %a) !dbg !6 {
+entry:
+  %b = add i16 %a, 0
+#dbg_value(i16 %b, !9, !DIExpression(), !11)
+  ret i16 0, !dbg !11
+}
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!5}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: 
"debugify", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, 
enums: !2)
+!1 = !DIFile(filename: "t.ll", directory: "/")
+!2 = !{}
+!5 = !{i32 2, !"Debug Info Version", i32 3}
+!6 = distinct !DISubprogram(name: "foo", linkageName: "foo", scope: null, 
file: !1, line: 1, type: !7, scopeLine: 1, spFlags: DISPFlagDefinition | 
DISPFlagOptimized, unit: !0, retainedNodes: !8)
+!7 = !DISubroutineType(types: !2)
+!8 = !{!9}
+!9 = !DILocalVariable(name: "1", scope: !6, file: !1, line: 1, type: !10)
+!10 = !DIBasicType(name: "ty16", size: 16, encoding: DW_ATE_unsigned)
+!11 = !DILocation(line: 1, column: 1, scope: !6)
+)");
+  ASSERT_TRUE(M);
+
+  Function *F = M->getFunction("foo");
+  BasicBlock &BB = F->getEntryBlock();
+  // Start with no trailing records.
+  ASSERT_FALSE(BB.getTrailingDbgRecords());
+
+  BasicBlock::iterator Ret = std::prev(BB.end());
+  BasicBlock::iterator B = std::prev(Ret);
+
+  // Delete terminator which has debug records: we now get trailing records.
+  Ret->eraseFromParent();
+  EXPECT_TRUE(BB.getTrailingDbgRecords());
+
+  BasicBlock *NewBB = BasicBlock::Create(C, "NewBB", F);
+  NewBB->splice(NewBB->end(), &BB, B, BB.end());
+
+  // The trailing records should've been absorbed into NewBB.
+  EXPECT_FALSE(BB.getTrailingDbgRecords());
+  EXPECT_TRUE(NewBB->getTrailingDbgRecords());
+  if (NewBB->getTrailingDbgRecords()) {
+    EXPECT_EQ(
+        llvm::range_size(NewBB->getTrailingDbgRecords()->getDbgRecordRange()),
+        1u);
+  }
+
+  // Drop the trailing records now, to prevent a cleanup assertion.
+  NewBB->deleteTrailingDbgRecords();
+}
+
 } // End anonymous namespace.

>From a0dadeb3a4bd7332046b9af811646b730eaec9d1 Mon Sep 17 00:00:00 2001
From: Orlando Cazalet-Hyams 
Date: Fri, 30 Aug 2024 13:44:42 +0100
Subject: [PATCH 2/2] [RemoveDIs] Fix asan-identified leak in 

[llvm-branch-commits] [llvm] release/19.x: [RemoveDIs] Fix spliceDebugInfo splice-to-end edge case (#105671, #106723) (PR #106952)

2024-09-02 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-ir

Author: Orlando Cazalet-Hyams (OCHyams)


Changes

Please can we backport 43661a1214353ea1773a711f403f8d1118e9ca0f (and 
7ffe67c17c524c2d3056c0721a33c7012dce3061) into the next dot release.

Replaces #106691 - this one includes a follow-up fix in 
7ffe67c17c524c2d3056c0721a33c7012dce3061. I couldn't create two separate 
requests because it creates conflicts.

---
Full diff: https://github.com/llvm/llvm-project/pull/106952.diff


2 Files Affected:

- (modified) llvm/lib/IR/BasicBlock.cpp (+10-2) 
- (modified) llvm/unittests/IR/BasicBlockDbgInfoTest.cpp (+52) 


``diff
diff --git a/llvm/lib/IR/BasicBlock.cpp b/llvm/lib/IR/BasicBlock.cpp
index 0a9498f051cb59..46896d3cdf7d50 100644
--- a/llvm/lib/IR/BasicBlock.cpp
+++ b/llvm/lib/IR/BasicBlock.cpp
@@ -975,8 +975,16 @@ void BasicBlock::spliceDebugInfoImpl(BasicBlock::iterator Dest, BasicBlock *Src,
   if (ReadFromTail && Src->getMarker(Last)) {
     DbgMarker *FromLast = Src->getMarker(Last);
     if (LastIsEnd) {
-      Dest->adoptDbgRecords(Src, Last, true);
-      // adoptDbgRecords will release any trailers.
+      if (Dest == end()) {
+        // Abosrb the trailing markers from Src.
+        assert(FromLast == Src->getTrailingDbgRecords());
+        createMarker(Dest)->absorbDebugValues(*FromLast, true);
+        FromLast->eraseFromParent();
+        Src->deleteTrailingDbgRecords();
+      } else {
+        // adoptDbgRecords will release any trailers.
+        Dest->adoptDbgRecords(Src, Last, true);
+      }
       assert(!Src->getTrailingDbgRecords());
     } else {
       // FIXME: can we use adoptDbgRecords here to reduce allocations?
diff --git a/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp 
b/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp
index 835780e63aaf4f..5ce14d3f6b9cef 100644
--- a/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp
+++ b/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp
@@ -1525,4 +1525,56 @@ TEST(BasicBlockDbgInfoTest, DbgMoveToEnd) {
   EXPECT_FALSE(Ret->hasDbgRecords());
 }
 
+TEST(BasicBlockDbgInfoTest, CloneTrailingRecordsToEmptyBlock) {
+  LLVMContext C;
+  std::unique_ptr<Module> M = parseIR(C, R"(
+define i16 @foo(i16 %a) !dbg !6 {
+entry:
+  %b = add i16 %a, 0
+#dbg_value(i16 %b, !9, !DIExpression(), !11)
+  ret i16 0, !dbg !11
+}
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!5}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: 
"debugify", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, 
enums: !2)
+!1 = !DIFile(filename: "t.ll", directory: "/")
+!2 = !{}
+!5 = !{i32 2, !"Debug Info Version", i32 3}
+!6 = distinct !DISubprogram(name: "foo", linkageName: "foo", scope: null, 
file: !1, line: 1, type: !7, scopeLine: 1, spFlags: DISPFlagDefinition | 
DISPFlagOptimized, unit: !0, retainedNodes: !8)
+!7 = !DISubroutineType(types: !2)
+!8 = !{!9}
+!9 = !DILocalVariable(name: "1", scope: !6, file: !1, line: 1, type: !10)
+!10 = !DIBasicType(name: "ty16", size: 16, encoding: DW_ATE_unsigned)
+!11 = !DILocation(line: 1, column: 1, scope: !6)
+)");
+  ASSERT_TRUE(M);
+
+  Function *F = M->getFunction("foo");
+  BasicBlock &BB = F->getEntryBlock();
+  // Start with no trailing records.
+  ASSERT_FALSE(BB.getTrailingDbgRecords());
+
+  BasicBlock::iterator Ret = std::prev(BB.end());
+  BasicBlock::iterator B = std::prev(Ret);
+
+  // Delete terminator which has debug records: we now get trailing records.
+  Ret->eraseFromParent();
+  EXPECT_TRUE(BB.getTrailingDbgRecords());
+
+  BasicBlock *NewBB = BasicBlock::Create(C, "NewBB", F);
+  NewBB->splice(NewBB->end(), &BB, B, BB.end());
+
+  // The trailing records should've been absorbed into NewBB.
+  EXPECT_FALSE(BB.getTrailingDbgRecords());
+  EXPECT_TRUE(NewBB->getTrailingDbgRecords());
+  if (DbgMarker *Trailing = NewBB->getTrailingDbgRecords()) {
+    EXPECT_EQ(llvm::range_size(Trailing->getDbgRecordRange()), 1u);
+    // Drop the trailing records now, to prevent a cleanup assertion.
+    Trailing->eraseFromParent();
+    NewBB->deleteTrailingDbgRecords();
+  }
+}
+
 } // End anonymous namespace.

``




https://github.com/llvm/llvm-project/pull/106952
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/19.x: [RemoveDIs] Fix spliceDebugInfo splice-to-end edge case (#105671) (PR #106691)

2024-09-02 Thread Orlando Cazalet-Hyams via llvm-branch-commits

OCHyams wrote:

Re-opened backport request with the fix too - #106952 

https://github.com/llvm/llvm-project/pull/106691


[llvm-branch-commits] [llvm] release/19.x: [RemoveDIs] Fix spliceDebugInfo splice-to-end edge case (#105671) (PR #106691)

2024-09-02 Thread Orlando Cazalet-Hyams via llvm-branch-commits

https://github.com/OCHyams closed 
https://github.com/llvm/llvm-project/pull/106691


[llvm-branch-commits] [llvm] release/19.x: [RemoveDIs] Fix spliceDebugInfo splice-to-end edge case (#105671, #106723) (PR #106952)

2024-09-02 Thread Orlando Cazalet-Hyams via llvm-branch-commits

https://github.com/OCHyams edited 
https://github.com/llvm/llvm-project/pull/106952


[llvm-branch-commits] [llvm] c0f53ea - Revert "[RuntimeDyld][Windows] Allocate space for dllimport things. (#102586)"

2024-09-02 Thread via llvm-branch-commits

Author: Alastair Houghton
Date: 2024-09-02T10:24:44+01:00
New Revision: c0f53ea70d7886b6504aa787b834b8216a4b3367

URL: 
https://github.com/llvm/llvm-project/commit/c0f53ea70d7886b6504aa787b834b8216a4b3367
DIFF: 
https://github.com/llvm/llvm-project/commit/c0f53ea70d7886b6504aa787b834b8216a4b3367.diff

LOG: Revert "[RuntimeDyld][Windows] Allocate space for dllimport things. 
(#102586)"

This reverts commit a0a253181e3eb2e7173a37b043b82325c7cddd67.

Added: 


Modified: 
llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp
llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.cpp
llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.h
llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h

Removed: 




diff  --git a/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp 
b/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp
index 5ac5532705dc49..7eb7da0138c972 100644
--- a/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp
+++ b/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp
@@ -690,12 +690,9 @@ unsigned RuntimeDyldImpl::computeSectionStubBufSize(const ObjectFile &Obj,
     if (!(RelSecI == Section))
       continue;
 
-    for (const RelocationRef &Reloc : SI->relocations()) {
+    for (const RelocationRef &Reloc : SI->relocations())
       if (relocationNeedsStub(Reloc))
         StubBufSize += StubSize;
-      if (relocationNeedsDLLImportStub(Reloc))
-        StubBufSize = sizeAfterAddingDLLImportStub(StubBufSize);
-    }
   }
 
   // Get section data size and alignment

diff  --git a/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.cpp 
b/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.cpp
index 73b37ee0ff3311..25a2d8780fb56c 100644
--- a/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.cpp
+++ b/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.cpp
@@ -119,14 +119,4 @@ bool RuntimeDyldCOFF::isCompatibleFile(const 
object::ObjectFile &Obj) const {
   return Obj.isCOFF();
 }
 
-bool RuntimeDyldCOFF::relocationNeedsDLLImportStub(
-    const RelocationRef &R) const {
-  object::symbol_iterator Symbol = R.getSymbol();
-  Expected<StringRef> TargetNameOrErr = Symbol->getName();
-  if (!TargetNameOrErr)
-    return false;
-
-  return TargetNameOrErr->starts_with(getImportSymbolPrefix());
-}
-
 } // namespace llvm

diff  --git a/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.h 
b/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.h
index 51d177c7bb8bec..25e3783cf160b2 100644
--- a/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.h
+++ b/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.h
@@ -14,7 +14,6 @@
 #define LLVM_RUNTIME_DYLD_COFF_H
 
 #include "RuntimeDyldImpl.h"
-#include "llvm/Support/MathExtras.h"
 
 namespace llvm {
 
@@ -46,12 +45,6 @@ class RuntimeDyldCOFF : public RuntimeDyldImpl {
 
   static constexpr StringRef getImportSymbolPrefix() { return "__imp_"; }
 
-  bool relocationNeedsDLLImportStub(const RelocationRef &R) const;
-
-  unsigned sizeAfterAddingDLLImportStub(unsigned Size) const {
-    return alignTo(Size, PointerSize) + PointerSize;
-  }
-
 private:
   unsigned PointerSize;
   uint32_t PointerReloc;

diff  --git a/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h 
b/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h
index de7630b9747ea4..e09c632842d6e9 100644
--- a/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h
+++ b/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h
@@ -455,16 +455,6 @@ class RuntimeDyldImpl {
 return true;// Conservative answer
   }
 
-  // Return true if the relocation R may require allocating a DLL import stub.
-  virtual bool relocationNeedsDLLImportStub(const RelocationRef &R) const {
-    return false;
-  }
-
-  // Add the size of a DLL import stub to the buffer size
-  virtual unsigned sizeAfterAddingDLLImportStub(unsigned Size) const {
-    return Size;
-  }
-
 public:
   RuntimeDyldImpl(RuntimeDyld::MemoryManager &MemMgr,
   JITSymbolResolver &Resolver)





[llvm-branch-commits] [LoongArch] Add TTI support for cpop with LSX (PR #106961)

2024-09-02 Thread via llvm-branch-commits

https://github.com/wangleiat created 
https://github.com/llvm/llvm-project/pull/106961

None




[llvm-branch-commits] [LoongArch] Add TTI support for cpop with LSX (PR #106961)

2024-09-02 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-loongarch

Author: wanglei (wangleiat)


Changes



---
Full diff: https://github.com/llvm/llvm-project/pull/106961.diff


4 Files Affected:

- (modified) llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp (+7) 
- (modified) llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.h (+1) 
- (added) llvm/test/Transforms/LoopIdiom/LoongArch/lit.local.cfg (+2) 
- (added) llvm/test/Transforms/LoopIdiom/LoongArch/popcnt.ll (+320) 


``diff
diff --git a/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp 
b/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp
index 2c7b0bfeaaad52..3b227fd7e4345c 100644
--- a/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp
@@ -83,4 +83,11 @@ const char *LoongArchTTIImpl::getRegisterClassName(unsigned 
ClassID) const {
   llvm_unreachable("unknown register class");
 }
 
+TargetTransformInfo::PopcntSupportKind
+LoongArchTTIImpl::getPopcntSupport(unsigned TyWidth) {
+  assert(isPowerOf2_32(TyWidth) && "Ty width must be power of 2");
+  llvm::errs() << "XXX: " << TyWidth << "\n";
+  return ST->hasExtLSX() ? TTI::PSK_FastHardware : TTI::PSK_Software;
+}
+
 // TODO: Implement more hooks to provide TTI machinery for LoongArch.
diff --git a/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.h 
b/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.h
index b2eef80dd9d3d2..f7ce75173be203 100644
--- a/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.h
+++ b/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.h
@@ -45,6 +45,7 @@ class LoongArchTTIImpl : public BasicTTIImplBase<LoongArchTTIImpl> {
   unsigned getRegisterClassForType(bool Vector, Type *Ty = nullptr) const;
   unsigned getMaxInterleaveFactor(ElementCount VF);
   const char *getRegisterClassName(unsigned ClassID) const;
+  TTI::PopcntSupportKind getPopcntSupport(unsigned TyWidth);
 
   // TODO: Implement more hooks to provide TTI machinery for LoongArch.
 };
diff --git a/llvm/test/Transforms/LoopIdiom/LoongArch/lit.local.cfg 
b/llvm/test/Transforms/LoopIdiom/LoongArch/lit.local.cfg
new file mode 100644
index 00..cc24278acbb414
--- /dev/null
+++ b/llvm/test/Transforms/LoopIdiom/LoongArch/lit.local.cfg
@@ -0,0 +1,2 @@
+if not "LoongArch" in config.root.targets:
+    config.unsupported = True
diff --git a/llvm/test/Transforms/LoopIdiom/LoongArch/popcnt.ll 
b/llvm/test/Transforms/LoopIdiom/LoongArch/popcnt.ll
new file mode 100644
index 00..7d0fd2ebee3e8d
--- /dev/null
+++ b/llvm/test/Transforms/LoopIdiom/LoongArch/popcnt.ll
@@ -0,0 +1,320 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -passes=loop-idiom -mtriple=loongarch32 -mattr=+lsx -S < %s | 
FileCheck %s --check-prefixes=CPOP
+; RUN: opt -passes=loop-idiom -mtriple=loongarch64 -mattr=+lsx -S < %s | 
FileCheck %s --check-prefixes=CPOP
+; RUN: opt -passes=loop-idiom -mtriple=loongarch32 -S < %s | FileCheck %s 
--check-prefixes=NOCPOP
+; RUN: opt -passes=loop-idiom -mtriple=loongarch64 -S < %s | FileCheck %s 
--check-prefixes=NOCPOP
+
+; Mostly copied from RISCV version.
+
+;To recognize this pattern:
+;int popcount(unsigned long long a) {
+;int c = 0;
+;while (a) {
+;c++;
+;a &= a - 1;
+;}
+;return c;
+;}
+;
+
+define i32 @popcount_i64(i64 %a) nounwind uwtable readnone ssp {
+; CPOP-LABEL: @popcount_i64(
+; CPOP-NEXT:  entry:
+; CPOP-NEXT:[[TMP0:%.*]] = call i64 @llvm.ctpop.i64(i64 [[A:%.*]])
+; CPOP-NEXT:[[TMP1:%.*]] = trunc i64 [[TMP0]] to i32
+; CPOP-NEXT:[[TMP2:%.*]] = icmp eq i32 [[TMP1]], 0
+; CPOP-NEXT:br i1 [[TMP2]], label [[WHILE_END:%.*]], label 
[[WHILE_BODY_PREHEADER:%.*]]
+; CPOP:   while.body.preheader:
+; CPOP-NEXT:br label [[WHILE_BODY:%.*]]
+; CPOP:   while.body:
+; CPOP-NEXT:[[TCPHI:%.*]] = phi i32 [ [[TMP1]], [[WHILE_BODY_PREHEADER]] 
], [ [[TCDEC:%.*]], [[WHILE_BODY]] ]
+; CPOP-NEXT:[[C_05:%.*]] = phi i32 [ [[INC:%.*]], [[WHILE_BODY]] ], [ 0, 
[[WHILE_BODY_PREHEADER]] ]
+; CPOP-NEXT:[[A_ADDR_04:%.*]] = phi i64 [ [[AND:%.*]], [[WHILE_BODY]] ], [ 
[[A]], [[WHILE_BODY_PREHEADER]] ]
+; CPOP-NEXT:[[INC]] = add nsw i32 [[C_05]], 1
+; CPOP-NEXT:[[SUB:%.*]] = add i64 [[A_ADDR_04]], -1
+; CPOP-NEXT:[[AND]] = and i64 [[SUB]], [[A_ADDR_04]]
+; CPOP-NEXT:[[TCDEC]] = sub nsw i32 [[TCPHI]], 1
+; CPOP-NEXT:[[TOBOOL:%.*]] = icmp sle i32 [[TCDEC]], 0
+; CPOP-NEXT:br i1 [[TOBOOL]], label [[WHILE_END_LOOPEXIT:%.*]], label 
[[WHILE_BODY]]
+; CPOP:   while.end.loopexit:
+; CPOP-NEXT:[[INC_LCSSA:%.*]] = phi i32 [ [[TMP1]], [[WHILE_BODY]] ]
+; CPOP-NEXT:br label [[WHILE_END]]
+; CPOP:   while.end:
+; CPOP-NEXT:[[C_0_LCSSA:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ 
[[INC_LCSSA]], [[WHILE_END_LOOPEXIT]] ]
+; CPOP-NEXT:ret i32 [[C_0_LCSSA]]
+;
+; NOCPOP-LABEL: @popcount_i64(
+; NOCPOP-NEXT:  entry:
+; NOCPOP-NEXT:[[TOBOOL3:%.*]] = icmp eq i64 [[A:%.*]],

[llvm-branch-commits] [llvm] [LoongArch] Add TTI support for cpop with LSX (PR #106961)

2024-09-02 Thread via llvm-branch-commits

https://github.com/wangleiat updated 
https://github.com/llvm/llvm-project/pull/106961
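
For reference, the loop that the popcnt.ll test in this PR asks LoopIdiom to recognize clears the lowest set bit on each iteration, so its trip count equals the number of set bits. A standalone C++ sketch of that equivalence follows; the function names are hypothetical and this is not code from the PR.

```cpp
#include <cassert>
#include <cstdint>

// The loop shape from the test's comment: each iteration clears the lowest
// set bit of A, so the loop runs once per set bit.
int popcountLoop(uint64_t A) {
  int C = 0;
  while (A) {
    ++C;
    A &= A - 1;
  }
  return C;
}

// Straightforward reference: inspect all 64 bit positions.
int popcountReference(uint64_t A) {
  int C = 0;
  for (int I = 0; I < 64; ++I)
    C += static_cast<int>((A >> I) & 1u);
  return C;
}

// Small usage example comparing the two on a few inputs.
void checkPopcount() {
  const uint64_t Samples[] = {0u, 1u, 0xFFu, 0x8000000000000001ull,
                              0xFFFFFFFFFFFFFFFFull};
  for (uint64_t S : Samples)
    assert(popcountLoop(S) == popcountReference(S));
}
```

Because the two functions agree on every input, replacing the loop with a single population-count operation preserves its result, which is what the CPOP check lines (a call to llvm.ctpop.i64) verify when LSX is enabled.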

>From 456935df7a65147dce6fbb8da8e60094ed647161 Mon Sep 17 00:00:00 2001
From: wanglei 
Date: Mon, 2 Sep 2024 17:59:38 +0800
Subject: [PATCH] remove debug msg

Created using spr 1.3.5-bogner
---
 llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp | 1 -
 1 file changed, 1 deletion(-)

diff --git a/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp 
b/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp
index 3b227fd7e4345c..5fbc7c734168d1 100644
--- a/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp
@@ -86,7 +86,6 @@ const char *LoongArchTTIImpl::getRegisterClassName(unsigned 
ClassID) const {
 TargetTransformInfo::PopcntSupportKind
 LoongArchTTIImpl::getPopcntSupport(unsigned TyWidth) {
   assert(isPowerOf2_32(TyWidth) && "Ty width must be power of 2");
-  llvm::errs() << "XXX: " << TyWidth << "\n";
   return ST->hasExtLSX() ? TTI::PSK_FastHardware : TTI::PSK_Software;
 }
 



[llvm-branch-commits] [llvm] [X86] Avoid generating nested CALLSEQ for TLS pointer function arguments (PR #106965)

2024-09-02 Thread Fabian Ritter via llvm-branch-commits

https://github.com/ritter-x2a created 
https://github.com/llvm/llvm-project/pull/106965

When a pointer to thread-local storage is passed in a function call,
ISel first lowers the call and wraps the resulting code in CALLSEQ
markers. Afterwards, to compute the pointer to TLS, a call to retrieve
the TLS base address is generated and then wrapped in a set of CALLSEQ
markers. If the latter call is inserted into the call sequence of the
former call, this leads to nested call frames, which are illegal and
lead to errors in the machine verifier.

This patch avoids surrounding the call to compute the TLS base address
in CALLSEQ markers if it is already surrounded by such markers. It
relies on zero-sized call frames being represented in the call frame
size info stored in the MachineBBs.

Fixes #45574 and #98042.
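
The nesting problem described above can be sketched with a toy checker. The Op enum and the functions findCallSeqViolation, emitTlsCall and checkNesting are hypothetical stand-ins for illustration; they do not use LLVM's MachineInstr or verifier APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical pseudo-ops: call-sequence markers, a call, and anything else.
enum class Op { CallSeqStart, CallSeqEnd, Call, Other };

// Returns the index of the first op that violates the "no nested call
// sequences" rule, Ops.size() for an unterminated sequence, or std::nullopt
// if the stream is well formed.
std::optional<std::size_t> findCallSeqViolation(const std::vector<Op> &Ops) {
  bool InCallSeq = false;
  for (std::size_t I = 0; I < Ops.size(); ++I) {
    if (Ops[I] == Op::CallSeqStart) {
      if (InCallSeq)
        return I; // a second start before the matching end: nested
      InCallSeq = true;
    } else if (Ops[I] == Op::CallSeqEnd) {
      if (!InCallSeq)
        return I; // an end with no open sequence
      InCallSeq = false;
    }
  }
  if (InCallSeq)
    return Ops.size();
  return std::nullopt;
}

// Mirrors the idea of the patch: only wrap the TLS base-address call in
// markers when the insertion point is not already inside a call sequence.
std::vector<Op> emitTlsCall(bool AlreadyInCallSeq) {
  if (AlreadyInCallSeq)
    return {Op::Call};
  return {Op::CallSeqStart, Op::Call, Op::CallSeqEnd};
}

// Small usage example: insert the TLS call inside an outer call sequence.
void checkNesting() {
  std::vector<Op> Fixed = {Op::CallSeqStart};
  for (Op O : emitTlsCall(/*AlreadyInCallSeq=*/true))
    Fixed.push_back(O);
  Fixed.push_back(Op::Call);
  Fixed.push_back(Op::CallSeqEnd);
  assert(!findCallSeqViolation(Fixed).has_value());
}
```

Passing false to emitTlsCall at the same insertion point would produce a second start marker at index 1, which the checker reports as a violation.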

>From 7159933bbf635490b2c4b9daea99d33373b6c2de Mon Sep 17 00:00:00 2001
From: Fabian Ritter 
Date: Mon, 2 Sep 2024 05:37:33 -0400
Subject: [PATCH] [X86] Avoid generating nested CALLSEQ for TLS pointer
 function arguments

When a pointer to thread-local storage is passed in a function call,
ISel first lowers the call and wraps the resulting code in CALLSEQ
markers. Afterwards, to compute the pointer to TLS, a call to retrieve
the TLS base address is generated and then wrapped in a set of CALLSEQ
markers. If the latter call is inserted into the call sequence of the
former call, this leads to nested call frames, which are illegal and
lead to errors in the machine verifier.

This patch avoids surrounding the call to compute the TLS base address
in CALLSEQ markers if it is already surrounded by such markers. It
relies on zero-sized call frames being represented in the call frame
size info stored in the MachineBBs.

Fixes #45574 and #98042.
---
 llvm/lib/Target/X86/X86ISelLowering.cpp|  7 +++
 llvm/test/CodeGen/X86/tls-function-argument.ll | 17 +
 2 files changed, 24 insertions(+)
 create mode 100644 llvm/test/CodeGen/X86/tls-function-argument.ll

diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp 
b/llvm/lib/Target/X86/X86ISelLowering.cpp
index bbee0af109c74b..bf9777888df831 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -35593,6 +35593,13 @@ X86TargetLowering::EmitLoweredTLSAddr(MachineInstr &MI,
   // inside MC, therefore without the two markers shrink-wrapping
   // may push the prologue/epilogue pass them.
   const TargetInstrInfo &TII = *Subtarget.getInstrInfo();
+
+  // Do not introduce CALLSEQ markers if we are already in a call sequence.
+  // Nested call sequences are not allowed and cause errors in the machine
+  // verifier.
+  if (TII.getCallFrameSizeAt(MI).has_value())
+    return BB;
+
   const MIMetadata MIMD(MI);
   MachineFunction &MF = *BB->getParent();
 
diff --git a/llvm/test/CodeGen/X86/tls-function-argument.ll 
b/llvm/test/CodeGen/X86/tls-function-argument.ll
new file mode 100644
index 00..ec2d664fc6b96f
--- /dev/null
+++ b/llvm/test/CodeGen/X86/tls-function-argument.ll
@@ -0,0 +1,17 @@
+; RUN: llc -verify-machineinstrs < %s -relocation-model=pic
+
+target datalayout = 
"e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+; Passing a pointer to thread-local storage to a function can be problematic
+; since computing such addresses requires a function call that is introduced
+; very late in instruction selection. We need to ensure that we don't introduce
+; nested call sequence markers if this function call happens in a call 
sequence.
+
+@TLS = internal thread_local global i64 zeroinitializer, align 8
+declare void @bar(ptr)
+define internal void @foo() {
+call void @bar(ptr @TLS)
+call void @bar(ptr @TLS)
+ret void
+}
\ No newline at end of file



[llvm-branch-commits] [llvm] [X86] Avoid generating nested CALLSEQ for TLS pointer function arguments (PR #106965)

2024-09-02 Thread Fabian Ritter via llvm-branch-commits

ritter-x2a wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is 
> open. Once all requirements are satisfied, merge this PR as a stack on 
> Graphite.

* **#106965** 👈
* **#106964**
* `main`

This stack of pull requests is managed by Graphite.

https://github.com/llvm/llvm-project/pull/106965


[llvm-branch-commits] [llvm] [X86] Avoid generating nested CALLSEQ for TLS pointer function arguments (PR #106965)

2024-09-02 Thread Fabian Ritter via llvm-branch-commits

https://github.com/ritter-x2a ready_for_review 
https://github.com/llvm/llvm-project/pull/106965


[llvm-branch-commits] [llvm] [X86] Avoid generating nested CALLSEQ for TLS pointer function arguments (PR #106965)

2024-09-02 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-x86

Author: Fabian Ritter (ritter-x2a)


Changes

When a pointer to thread-local storage is passed in a function call,
ISel first lowers the call and wraps the resulting code in CALLSEQ
markers. Afterwards, to compute the pointer to TLS, a call to retrieve
the TLS base address is generated and then wrapped in a set of CALLSEQ
markers. If the latter call is inserted into the call sequence of the
former call, this leads to nested call frames, which are illegal and
lead to errors in the machine verifier.

This patch avoids surrounding the call to compute the TLS base address
in CALLSEQ markers if it is already surrounded by such markers. It
relies on zero-sized call frames being represented in the call frame
size info stored in the MachineBBs.

Fixes #45574 and #98042.

---
Full diff: https://github.com/llvm/llvm-project/pull/106965.diff


2 Files Affected:

- (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+7) 
- (added) llvm/test/CodeGen/X86/tls-function-argument.ll (+17) 


``diff
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp 
b/llvm/lib/Target/X86/X86ISelLowering.cpp
index bbee0af109c74b..bf9777888df831 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -35593,6 +35593,13 @@ X86TargetLowering::EmitLoweredTLSAddr(MachineInstr &MI,
   // inside MC, therefore without the two markers shrink-wrapping
   // may push the prologue/epilogue pass them.
   const TargetInstrInfo &TII = *Subtarget.getInstrInfo();
+
+  // Do not introduce CALLSEQ markers if we are already in a call sequence.
+  // Nested call sequences are not allowed and cause errors in the machine
+  // verifier.
+  if (TII.getCallFrameSizeAt(MI).has_value())
+    return BB;
+
   const MIMetadata MIMD(MI);
   MachineFunction &MF = *BB->getParent();
 
diff --git a/llvm/test/CodeGen/X86/tls-function-argument.ll 
b/llvm/test/CodeGen/X86/tls-function-argument.ll
new file mode 100644
index 00..ec2d664fc6b96f
--- /dev/null
+++ b/llvm/test/CodeGen/X86/tls-function-argument.ll
@@ -0,0 +1,17 @@
+; RUN: llc -verify-machineinstrs < %s -relocation-model=pic
+
+target datalayout = 
"e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+; Passing a pointer to thread-local storage to a function can be problematic
+; since computing such addresses requires a function call that is introduced
+; very late in instruction selection. We need to ensure that we don't introduce
+; nested call sequence markers if this function call happens in a call 
sequence.
+
+@TLS = internal thread_local global i64 zeroinitializer, align 8
+declare void @bar(ptr)
+define internal void @foo() {
+call void @bar(ptr @TLS)
+call void @bar(ptr @TLS)
+ret void
+}
\ No newline at end of file

``




https://github.com/llvm/llvm-project/pull/106965


[llvm-branch-commits] [llvm] release/19.x: [AVR] Fix parsing & emitting relative jumps (#106722) (PR #106729)

2024-09-02 Thread Patryk Wychowaniec via llvm-branch-commits

Patryk27 wrote:

Can I somehow help here? Usually I'd cherry-pick changes from 
https://github.com/llvm/llvm-project/pull/106739 into here myself, but since 
I'm not the author of the pull request, I can't modify it 👀 

(and we'd probably like to avoid having two separate backport pull requests to 
avoid breaking the branch in the meantime)

https://github.com/llvm/llvm-project/pull/106729


[llvm-branch-commits] [llvm] release/19.x: Restrict LLVM_TARGETS_TO_BUILD in Windows release packaging (#106059) (PR #106546)

2024-09-02 Thread via llvm-branch-commits

zmodem wrote:

> @zmodem (or anyone else). If you would like to add a note about this fix in 
> the release notes (completely optional). Please reply to this comment with a 
> one or two sentence description of the fix. When you are done, please add the 
> release:note label to this PR.

How about:

"Starting with LLVM 19, the Windows installers only include support for the 
X86, ARM, and AArch64 targets in order to keep the build size within the limits 
of the NSIS installer framework."

https://github.com/llvm/llvm-project/pull/106546


[llvm-branch-commits] [llvm] [AMDGPU] Fix sign confusion in performMulLoHiCombine (PR #106977)

2024-09-02 Thread Jay Foad via llvm-branch-commits

https://github.com/jayfoad milestoned 
https://github.com/llvm/llvm-project/pull/106977


[llvm-branch-commits] [llvm] [AMDGPU] Fix sign confusion in performMulLoHiCombine (PR #106977)

2024-09-02 Thread Jay Foad via llvm-branch-commits

https://github.com/jayfoad created 
https://github.com/llvm/llvm-project/pull/106977

SMUL_LOHI and UMUL_LOHI are different operations because the high part of the 
result is different, so it is not OK to optimize the signed version to 
MUL_U24/MULHI_U24 or the unsigned version to MUL_I24/MULHI_I24.
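
As a rough illustration of why the two opcodes cannot be swapped (a sketch only; the helper names `mullo`, `mulhi_unsigned` and `mulhi_signed` are made up for this example and are not LLVM APIs): the low half of a 32x32 multiply does not depend on signedness, but the high half does. The operand 0xff803fe1 is the constant used in the new test; read as a signed 32-bit value it is -8372255.

```cpp
#include <cstdint>

// Low 32 bits of the product. Identical for signed and unsigned operands,
// because the arithmetic is modulo 2^32 either way.
constexpr uint32_t mullo(uint32_t a, uint32_t b) {
  return a * b;
}

// High 32 bits of the full 64-bit product with both operands zero-extended
// (the "hi" result of an unsigned widening multiply).
constexpr uint32_t mulhi_unsigned(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>((static_cast<uint64_t>(a) * b) >> 32);
}

// High 32 bits of the full 64-bit product with both operands sign-extended
// (the "hi" result of a signed widening multiply). The product is converted
// to unsigned before shifting so the shift is well defined.
constexpr uint32_t mulhi_signed(int32_t a, int32_t b) {
  return static_cast<uint32_t>(
      static_cast<uint64_t>(static_cast<int64_t>(a) * b) >> 32);
}
```

For the operands 1 and 0xff803fe1, `mullo` gives 0xff803fe1 regardless of signedness, while `mulhi_unsigned(1, 0xff803fe1)` is 0 and `mulhi_signed(1, -8372255)` is 0xffffffff. Picking the 24-bit multiply of the wrong signedness therefore corrupts only the high half.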

>From 04226baceb4e2823a7ca3daac236f705b3c6c33e Mon Sep 17 00:00:00 2001
From: Jay Foad 
Date: Tue, 27 Aug 2024 17:09:40 +0100
Subject: [PATCH] [AMDGPU] Fix sign confusion in performMulLoHiCombine
 (#105831)

SMUL_LOHI and UMUL_LOHI are different operations because the high part
of the result is different, so it is not OK to optimize the signed
version to MUL_U24/MULHI_U24 or the unsigned version to
MUL_I24/MULHI_I24.
---
 llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | 30 +++---
 llvm/test/CodeGen/AMDGPU/mul_int24.ll | 98 +++
 2 files changed, 116 insertions(+), 12 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
index 39ae7c96cf7729..a71c9453d968dd 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
@@ -4349,6 +4349,7 @@ AMDGPUTargetLowering::performMulLoHiCombine(SDNode *N,
   SelectionDAG &DAG = DCI.DAG;
   SDLoc DL(N);
 
+  bool Signed = N->getOpcode() == ISD::SMUL_LOHI;
   SDValue N0 = N->getOperand(0);
   SDValue N1 = N->getOperand(1);
 
@@ -4363,20 +4364,25 @@ AMDGPUTargetLowering::performMulLoHiCombine(SDNode *N,
 
   // Try to use two fast 24-bit multiplies (one for each half of the result)
   // instead of one slow extending multiply.
-  unsigned LoOpcode, HiOpcode;
-  if (Subtarget->hasMulU24() && isU24(N0, DAG) && isU24(N1, DAG)) {
-    N0 = DAG.getZExtOrTrunc(N0, DL, MVT::i32);
-    N1 = DAG.getZExtOrTrunc(N1, DL, MVT::i32);
-    LoOpcode = AMDGPUISD::MUL_U24;
-    HiOpcode = AMDGPUISD::MULHI_U24;
-  } else if (Subtarget->hasMulI24() && isI24(N0, DAG) && isI24(N1, DAG)) {
-    N0 = DAG.getSExtOrTrunc(N0, DL, MVT::i32);
-    N1 = DAG.getSExtOrTrunc(N1, DL, MVT::i32);
-    LoOpcode = AMDGPUISD::MUL_I24;
-    HiOpcode = AMDGPUISD::MULHI_I24;
+  unsigned LoOpcode = 0;
+  unsigned HiOpcode = 0;
+  if (Signed) {
+    if (Subtarget->hasMulI24() && isI24(N0, DAG) && isI24(N1, DAG)) {
+      N0 = DAG.getSExtOrTrunc(N0, DL, MVT::i32);
+      N1 = DAG.getSExtOrTrunc(N1, DL, MVT::i32);
+      LoOpcode = AMDGPUISD::MUL_I24;
+      HiOpcode = AMDGPUISD::MULHI_I24;
+    }
   } else {
-    return SDValue();
+    if (Subtarget->hasMulU24() && isU24(N0, DAG) && isU24(N1, DAG)) {
+      N0 = DAG.getZExtOrTrunc(N0, DL, MVT::i32);
+      N1 = DAG.getZExtOrTrunc(N1, DL, MVT::i32);
+      LoOpcode = AMDGPUISD::MUL_U24;
+      HiOpcode = AMDGPUISD::MULHI_U24;
+    }
   }
+  if (!LoOpcode)
+    return SDValue();
 
   SDValue Lo = DAG.getNode(LoOpcode, DL, MVT::i32, N0, N1);
   SDValue Hi = DAG.getNode(HiOpcode, DL, MVT::i32, N0, N1);
diff --git a/llvm/test/CodeGen/AMDGPU/mul_int24.ll b/llvm/test/CodeGen/AMDGPU/mul_int24.ll
index be77a10380c49b..8f4c48fae6fb31 100644
--- a/llvm/test/CodeGen/AMDGPU/mul_int24.ll
+++ b/llvm/test/CodeGen/AMDGPU/mul_int24.ll
@@ -813,4 +813,102 @@ bb7:
   ret void
 
 }
+
+define amdgpu_kernel void @test_umul_i24(ptr addrspace(1) %out, i32 %arg) {
+; SI-LABEL: test_umul_i24:
+; SI:   ; %bb.0:
+; SI-NEXT:s_load_dword s1, s[2:3], 0xb
+; SI-NEXT:v_mov_b32_e32 v0, 0xff803fe1
+; SI-NEXT:s_mov_b32 s0, 0
+; SI-NEXT:s_mov_b32 s3, 0xf000
+; SI-NEXT:s_waitcnt lgkmcnt(0)
+; SI-NEXT:s_lshr_b32 s1, s1, 9
+; SI-NEXT:v_mul_hi_u32 v0, s1, v0
+; SI-NEXT:s_mul_i32 s1, s1, 0xff803fe1
+; SI-NEXT:v_alignbit_b32 v0, v0, s1, 1
+; SI-NEXT:s_mov_b32 s2, -1
+; SI-NEXT:s_mov_b32 s1, s0
+; SI-NEXT:buffer_store_dword v0, off, s[0:3], 0
+; SI-NEXT:s_endpgm
+;
+; VI-LABEL: test_umul_i24:
+; VI:   ; %bb.0:
+; VI-NEXT:s_load_dword s0, s[2:3], 0x2c
+; VI-NEXT:v_mov_b32_e32 v0, 0xff803fe1
+; VI-NEXT:s_mov_b32 s3, 0xf000
+; VI-NEXT:s_mov_b32 s2, -1
+; VI-NEXT:s_waitcnt lgkmcnt(0)
+; VI-NEXT:s_lshr_b32 s0, s0, 9
+; VI-NEXT:v_mad_u64_u32 v[0:1], s[0:1], s0, v0, 0
+; VI-NEXT:s_mov_b32 s0, 0
+; VI-NEXT:s_mov_b32 s1, s0
+; VI-NEXT:v_alignbit_b32 v0, v1, v0, 1
+; VI-NEXT:s_nop 1
+; VI-NEXT:buffer_store_dword v0, off, s[0:3], 0
+; VI-NEXT:s_endpgm
+;
+; GFX9-LABEL: test_umul_i24:
+; GFX9:   ; %bb.0:
+; GFX9-NEXT:s_load_dword s1, s[2:3], 0x2c
+; GFX9-NEXT:s_mov_b32 s0, 0
+; GFX9-NEXT:s_mov_b32 s3, 0xf000
+; GFX9-NEXT:s_mov_b32 s2, -1
+; GFX9-NEXT:s_waitcnt lgkmcnt(0)
+; GFX9-NEXT:s_lshr_b32 s1, s1, 9
+; GFX9-NEXT:s_mul_hi_u32 s4, s1, 0xff803fe1
+; GFX9-NEXT:s_mul_i32 s1, s1, 0xff803fe1
+; GFX9-NEXT:v_mov_b32_e32 v0, s1
+; GFX9-NEXT:v_alignbit_b32 v0, s4, v0, 1
+; GFX9-NEXT:s_mov_b32 s1, s0
+; GFX9-NEXT:buffer_store_dword v0, off, s[0:3], 0
+; GFX9-NEXT:s_endpgm
+;
+; EG-LABEL: test_umul_i24:
+; EG:   ; %bb.0:
+; EG-

[llvm-branch-commits] [llvm] [AMDGPU] Fix sign confusion in performMulLoHiCombine (PR #106977)

2024-09-02 Thread Jay Foad via llvm-branch-commits

jayfoad wrote:

This is a backport of #105831.

https://github.com/llvm/llvm-project/pull/106977


[llvm-branch-commits] [llvm] [AMDGPU] Fix sign confusion in performMulLoHiCombine (PR #106977)

2024-09-02 Thread Jay Foad via llvm-branch-commits

https://github.com/jayfoad edited 
https://github.com/llvm/llvm-project/pull/106977


[llvm-branch-commits] [llvm] [AMDGPU] Fix sign confusion in performMulLoHiCombine (PR #106977)

2024-09-02 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-amdgpu

Author: Jay Foad (jayfoad)


Changes

SMUL_LOHI and UMUL_LOHI are different operations because the high part of the 
result is different, so it is not OK to optimize the signed version to 
MUL_U24/MULHI_U24 or the unsigned version to MUL_I24/MULHI_I24.

---
Full diff: https://github.com/llvm/llvm-project/pull/106977.diff


2 Files Affected:

- (modified) llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp (+18-12) 
- (modified) llvm/test/CodeGen/AMDGPU/mul_int24.ll (+98) 


``diff
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
index 39ae7c96cf7729..a71c9453d968dd 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
@@ -4349,6 +4349,7 @@ AMDGPUTargetLowering::performMulLoHiCombine(SDNode *N,
   SelectionDAG &DAG = DCI.DAG;
   SDLoc DL(N);
 
+  bool Signed = N->getOpcode() == ISD::SMUL_LOHI;
   SDValue N0 = N->getOperand(0);
   SDValue N1 = N->getOperand(1);
 
@@ -4363,20 +4364,25 @@ AMDGPUTargetLowering::performMulLoHiCombine(SDNode *N,
 
   // Try to use two fast 24-bit multiplies (one for each half of the result)
   // instead of one slow extending multiply.
-  unsigned LoOpcode, HiOpcode;
-  if (Subtarget->hasMulU24() && isU24(N0, DAG) && isU24(N1, DAG)) {
-    N0 = DAG.getZExtOrTrunc(N0, DL, MVT::i32);
-    N1 = DAG.getZExtOrTrunc(N1, DL, MVT::i32);
-    LoOpcode = AMDGPUISD::MUL_U24;
-    HiOpcode = AMDGPUISD::MULHI_U24;
-  } else if (Subtarget->hasMulI24() && isI24(N0, DAG) && isI24(N1, DAG)) {
-    N0 = DAG.getSExtOrTrunc(N0, DL, MVT::i32);
-    N1 = DAG.getSExtOrTrunc(N1, DL, MVT::i32);
-    LoOpcode = AMDGPUISD::MUL_I24;
-    HiOpcode = AMDGPUISD::MULHI_I24;
+  unsigned LoOpcode = 0;
+  unsigned HiOpcode = 0;
+  if (Signed) {
+    if (Subtarget->hasMulI24() && isI24(N0, DAG) && isI24(N1, DAG)) {
+      N0 = DAG.getSExtOrTrunc(N0, DL, MVT::i32);
+      N1 = DAG.getSExtOrTrunc(N1, DL, MVT::i32);
+      LoOpcode = AMDGPUISD::MUL_I24;
+      HiOpcode = AMDGPUISD::MULHI_I24;
+    }
   } else {
-    return SDValue();
+    if (Subtarget->hasMulU24() && isU24(N0, DAG) && isU24(N1, DAG)) {
+      N0 = DAG.getZExtOrTrunc(N0, DL, MVT::i32);
+      N1 = DAG.getZExtOrTrunc(N1, DL, MVT::i32);
+      LoOpcode = AMDGPUISD::MUL_U24;
+      HiOpcode = AMDGPUISD::MULHI_U24;
+    }
   }
+  if (!LoOpcode)
+    return SDValue();
 
   SDValue Lo = DAG.getNode(LoOpcode, DL, MVT::i32, N0, N1);
   SDValue Hi = DAG.getNode(HiOpcode, DL, MVT::i32, N0, N1);
diff --git a/llvm/test/CodeGen/AMDGPU/mul_int24.ll b/llvm/test/CodeGen/AMDGPU/mul_int24.ll
index be77a10380c49b..8f4c48fae6fb31 100644
--- a/llvm/test/CodeGen/AMDGPU/mul_int24.ll
+++ b/llvm/test/CodeGen/AMDGPU/mul_int24.ll
@@ -813,4 +813,102 @@ bb7:
   ret void
 
 }
+
+define amdgpu_kernel void @test_umul_i24(ptr addrspace(1) %out, i32 %arg) {
+; SI-LABEL: test_umul_i24:
+; SI:   ; %bb.0:
+; SI-NEXT:s_load_dword s1, s[2:3], 0xb
+; SI-NEXT:v_mov_b32_e32 v0, 0xff803fe1
+; SI-NEXT:s_mov_b32 s0, 0
+; SI-NEXT:s_mov_b32 s3, 0xf000
+; SI-NEXT:s_waitcnt lgkmcnt(0)
+; SI-NEXT:s_lshr_b32 s1, s1, 9
+; SI-NEXT:v_mul_hi_u32 v0, s1, v0
+; SI-NEXT:s_mul_i32 s1, s1, 0xff803fe1
+; SI-NEXT:v_alignbit_b32 v0, v0, s1, 1
+; SI-NEXT:s_mov_b32 s2, -1
+; SI-NEXT:s_mov_b32 s1, s0
+; SI-NEXT:buffer_store_dword v0, off, s[0:3], 0
+; SI-NEXT:s_endpgm
+;
+; VI-LABEL: test_umul_i24:
+; VI:   ; %bb.0:
+; VI-NEXT:s_load_dword s0, s[2:3], 0x2c
+; VI-NEXT:v_mov_b32_e32 v0, 0xff803fe1
+; VI-NEXT:s_mov_b32 s3, 0xf000
+; VI-NEXT:s_mov_b32 s2, -1
+; VI-NEXT:s_waitcnt lgkmcnt(0)
+; VI-NEXT:s_lshr_b32 s0, s0, 9
+; VI-NEXT:v_mad_u64_u32 v[0:1], s[0:1], s0, v0, 0
+; VI-NEXT:s_mov_b32 s0, 0
+; VI-NEXT:s_mov_b32 s1, s0
+; VI-NEXT:v_alignbit_b32 v0, v1, v0, 1
+; VI-NEXT:s_nop 1
+; VI-NEXT:buffer_store_dword v0, off, s[0:3], 0
+; VI-NEXT:s_endpgm
+;
+; GFX9-LABEL: test_umul_i24:
+; GFX9:   ; %bb.0:
+; GFX9-NEXT:s_load_dword s1, s[2:3], 0x2c
+; GFX9-NEXT:s_mov_b32 s0, 0
+; GFX9-NEXT:s_mov_b32 s3, 0xf000
+; GFX9-NEXT:s_mov_b32 s2, -1
+; GFX9-NEXT:s_waitcnt lgkmcnt(0)
+; GFX9-NEXT:s_lshr_b32 s1, s1, 9
+; GFX9-NEXT:s_mul_hi_u32 s4, s1, 0xff803fe1
+; GFX9-NEXT:s_mul_i32 s1, s1, 0xff803fe1
+; GFX9-NEXT:v_mov_b32_e32 v0, s1
+; GFX9-NEXT:v_alignbit_b32 v0, s4, v0, 1
+; GFX9-NEXT:s_mov_b32 s1, s0
+; GFX9-NEXT:buffer_store_dword v0, off, s[0:3], 0
+; GFX9-NEXT:s_endpgm
+;
+; EG-LABEL: test_umul_i24:
+; EG:   ; %bb.0:
+; EG-NEXT:ALU 8, @4, KC0[CB0:0-32], KC1[]
+; EG-NEXT:MEM_RAT_CACHELESS STORE_RAW T0.X, T1.X, 1
+; EG-NEXT:CF_END
+; EG-NEXT:PAD
+; EG-NEXT:ALU clause starting at 4:
+; EG-NEXT: LSHR * T0.W, KC0[2].Z, literal.x,
+; EG-NEXT:9(1.261169e-44), 0(0.00e+00)
+; EG-NEXT: MULHI * T0.X, PV.W, literal.x,
+; EG-NEXT:-8372255(nan), 0(0.00e+0

[llvm-branch-commits] [compiler-rt] release/19.x: [compiler-rt][fuzzer] SetThreadName build fix for Mingw… (PR #106908)

2024-09-02 Thread Martin Storsjö via llvm-branch-commits

https://github.com/mstorsjo requested changes to this pull request.

Do not backport this. This breaks builds in mingw environments that don't use 
winpthreads!

https://github.com/llvm/llvm-project/pull/106908


[llvm-branch-commits] [compiler-rt] release/19.x: [compiler-rt][fuzzer] SetThreadName build fix for Mingw… (PR #106908)

2024-09-02 Thread David CARLIER via llvm-branch-commits

https://github.com/devnexen closed 
https://github.com/llvm/llvm-project/pull/106908


[llvm-branch-commits] [llvm] release/19.x: Win release packaging: Don't try to use rpmalloc for 32-bit x86 (#106969) (PR #106985)

2024-09-02 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/106985

Backport ef26afcb88dcb5f2de79bfc3cf88a8ea10f230ec

Requested by: @zmodem

>From ca30c4312e13edb54fa2f4288f7e430854671d19 Mon Sep 17 00:00:00 2001
From: Hans 
Date: Mon, 2 Sep 2024 15:04:13 +0200
Subject: [PATCH] Win release packaging: Don't try to use rpmalloc for 32-bit
 x86 (#106969)

because that doesn't work (results in `LINK : error LNK2001: unresolved
external symbol malloc`).
Based on the title of #91862 it was only intended for use in 64-bit
builds.

(cherry picked from commit ef26afcb88dcb5f2de79bfc3cf88a8ea10f230ec)
---
 llvm/utils/release/build_llvm_release.bat | 1 +
 1 file changed, 1 insertion(+)

diff --git a/llvm/utils/release/build_llvm_release.bat b/llvm/utils/release/build_llvm_release.bat
index 64ae2d41ab2b02..3508748c1d5404 100755
--- a/llvm/utils/release/build_llvm_release.bat
+++ b/llvm/utils/release/build_llvm_release.bat
@@ -193,6 +193,7 @@ REM Stage0 binaries directory; used in stage1.
 set "stage0_bin_dir=%build_dir%/build32_stage0/bin"
 set cmake_flags=^
   %common_cmake_flags% ^
+  -DLLVM_ENABLE_RPMALLOC=OFF ^
   -DLLDB_TEST_COMPILER=%stage0_bin_dir%/clang.exe ^
   -DPYTHON_HOME=%PYTHONHOME% ^
   -DPython3_ROOT_DIR=%PYTHONHOME% ^



[llvm-branch-commits] [llvm] release/19.x: Win release packaging: Don't try to use rpmalloc for 32-bit x86 (#106969) (PR #106985)

2024-09-02 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/106985


[llvm-branch-commits] [llvm] release/19.x: Win release packaging: Don't try to use rpmalloc for 32-bit x86 (#106969) (PR #106985)

2024-09-02 Thread via llvm-branch-commits

llvmbot wrote:

@aganea What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/106985


[llvm-branch-commits] [llvm] release/19.x: Win release packaging: Don't try to use rpmalloc for 32-bit x86 (#106969) (PR #106985)

2024-09-02 Thread Alexandre Ganea via llvm-branch-commits

https://github.com/aganea approved this pull request.


https://github.com/llvm/llvm-project/pull/106985


[llvm-branch-commits] [clang-tools-extra] release/19.x: [clangd] Update TidyFastChecks for release/19.x (#106354) (PR #106989)

2024-09-02 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/106989


[llvm-branch-commits] [clang-tools-extra] release/19.x: [clangd] Update TidyFastChecks for release/19.x (#106354) (PR #106989)

2024-09-02 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/106989

Backport b47d7ce8121b1cb1923e879d58eaa1d63aeaaae2

Requested by: @kadircet
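
For readers unfamiliar with the layout of TidyFastChecks.inc: the file is a list of `FAST(CHECK, DELTA)` and `SLOW(CHECK, DELTA)` macro invocations, and a consumer defines those two macros before including it. The sketch below shows that pattern in a self-contained form. It is not clangd's actual consumer code: `DEMO_CHECK_LIST`, `CheckInfo`, `kChecks`, `sameName` and `isFastCheck` are hypothetical names, and the three entries are copied from the patch only as sample data. It assumes C++14 or later.

```cpp
// One table row per check: stringified name, measured delta, and whether
// the list classified it as fast.
struct CheckInfo {
  const char *name;
  double delta;
  bool fast;
};

// Stand-in for the contents of the .inc file. A real consumer would
// #include the file instead of expanding a list macro like this one.
#define DEMO_CHECK_LIST(FAST, SLOW)               \
  FAST(abseil-cleanup-ctad, -2.0)                 \
  SLOW(altera-id-dependent-backward-branch, 13.0) \
  FAST(altera-unroll-loops, 2.0)

// Check names contain hyphens, so they are not identifiers; #CHECK turns
// the token sequence into a string literal.
#define AS_FAST(CHECK, DELTA) {#CHECK, DELTA, true},
#define AS_SLOW(CHECK, DELTA) {#CHECK, DELTA, false},

constexpr CheckInfo kChecks[] = {DEMO_CHECK_LIST(AS_FAST, AS_SLOW)};

#undef AS_FAST
#undef AS_SLOW

// Minimal constexpr string comparison so the lookup can run at compile time.
constexpr bool sameName(const char *a, const char *b) {
  while (*a != '\0' && *a == *b) {
    ++a;
    ++b;
  }
  return *a == *b;
}

// Returns true only for checks listed as FAST; unknown names are treated
// as not fast in this sketch.
constexpr bool isFastCheck(const char *name) {
  for (const CheckInfo &c : kChecks)
    if (sameName(c.name, name))
      return c.fast;
  return false;
}
```

With this table, `isFastCheck("altera-unroll-loops")` is true and `isFastCheck("altera-id-dependent-backward-branch")` is false. Keeping the data in macro invocations lets different consumers build different structures from the same list.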

>From 0de791716d55892ea1872abfec078e4e07bccb19 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?kadir=20=C3=A7etinkaya?= 
Date: Mon, 2 Sep 2024 15:25:26 +0200
Subject: [PATCH] [clangd] Update TidyFastChecks for release/19.x (#106354)

Run for clang-tidy checks available in release/19.x branch.

Some notable findings:
- altera-id-dependent-backward-branch stays slow at 13%.
- misc-const-correctness became faster, going from 261% to 67%, but is
  still above the 8% threshold.
- misc-header-include-cycle is a new SLOW check with 10% runtime
  implications.
- readability-container-size-empty went from 16% to 13%, still SLOW.

(cherry picked from commit b47d7ce8121b1cb1923e879d58eaa1d63aeaaae2)
---
 clang-tools-extra/clangd/TidyFastChecks.inc | 669 +++-
 1 file changed, 367 insertions(+), 302 deletions(-)

diff --git a/clang-tools-extra/clangd/TidyFastChecks.inc b/clang-tools-extra/clangd/TidyFastChecks.inc
index 9050ce16127ff4..de1a025602fa9c 100644
--- a/clang-tools-extra/clangd/TidyFastChecks.inc
+++ b/clang-tools-extra/clangd/TidyFastChecks.inc
@@ -7,370 +7,435 @@
 #define SLOW(CHECK, DELTA)
 #endif
 
-FAST(abseil-cleanup-ctad, -1.0)
+FAST(abseil-cleanup-ctad, -2.0)
 FAST(abseil-duration-addition, 0.0)
-FAST(abseil-duration-comparison, 1.0)
-FAST(abseil-duration-conversion-cast, 3.0)
-FAST(abseil-duration-division, -0.0)
-FAST(abseil-duration-factory-float, 1.0)
-FAST(abseil-duration-factory-scale, -0.0)
-FAST(abseil-duration-subtraction, 1.0)
-FAST(abseil-duration-unnecessary-conversion, 4.0)
-FAST(abseil-faster-strsplit-delimiter, 2.0)
-FAST(abseil-no-internal-dependencies, -1.0)
-FAST(abseil-no-namespace, -1.0)
-FAST(abseil-redundant-strcat-calls, 2.0)
-FAST(abseil-str-cat-append, 1.0)
-FAST(abseil-string-find-startswith, 1.0)
-FAST(abseil-string-find-str-contains, 1.0)
-FAST(abseil-time-comparison, -0.0)
-FAST(abseil-time-subtraction, 0.0)
+FAST(abseil-duration-comparison, -1.0)
+FAST(abseil-duration-conversion-cast, -1.0)
+FAST(abseil-duration-division, 0.0)
+FAST(abseil-duration-factory-float, 2.0)
+FAST(abseil-duration-factory-scale, 1.0)
+FAST(abseil-duration-subtraction, -1.0)
+FAST(abseil-duration-unnecessary-conversion, -0.0)
+FAST(abseil-faster-strsplit-delimiter, 3.0)
+FAST(abseil-no-internal-dependencies, 1.0)
+FAST(abseil-no-namespace, -0.0)
+FAST(abseil-redundant-strcat-calls, 1.0)
+FAST(abseil-str-cat-append, -0.0)
+FAST(abseil-string-find-startswith, -1.0)
+FAST(abseil-string-find-str-contains, 4.0)
+FAST(abseil-time-comparison, -1.0)
+FAST(abseil-time-subtraction, 1.0)
 FAST(abseil-upgrade-duration-conversions, 2.0)
 SLOW(altera-id-dependent-backward-branch, 13.0)
-FAST(altera-kernel-name-restriction, -1.0)
-FAST(altera-single-work-item-barrier, -1.0)
-FAST(altera-struct-pack-align, -1.0)
+FAST(altera-kernel-name-restriction, 4.0)
+FAST(altera-single-work-item-barrier, 1.0)
+FAST(altera-struct-pack-align, -0.0)
 FAST(altera-unroll-loops, 2.0)
-FAST(android-cloexec-accept, -1.0)
-FAST(android-cloexec-accept4, 3.0)
-FAST(android-cloexec-creat, 0.0)
-FAST(android-cloexec-dup, 3.0)
-FAST(android-cloexec-epoll-create, -2.0)
-FAST(android-cloexec-epoll-create1, -1.0)
-FAST(android-cloexec-fopen, -0.0)
-FAST(android-cloexec-inotify-init, 1.0)
-FAST(android-cloexec-inotify-init1, 2.0)
-FAST(android-cloexec-memfd-create, 2.0)
-FAST(android-cloexec-open, -1.0)
-FAST(android-cloexec-pipe, -1.0)
+FAST(android-cloexec-accept, 0.0)
+FAST(android-cloexec-accept4, 1.0)
+FAST(android-cloexec-creat, 1.0)
+FAST(android-cloexec-dup, 0.0)
+FAST(android-cloexec-epoll-create, 2.0)
+FAST(android-cloexec-epoll-create1, 0.0)
+FAST(android-cloexec-fopen, -1.0)
+FAST(android-cloexec-inotify-init, 2.0)
+FAST(android-cloexec-inotify-init1, -0.0)
+FAST(android-cloexec-memfd-create, -1.0)
+FAST(android-cloexec-open, 1.0)
+FAST(android-cloexec-pipe, -0.0)
 FAST(android-cloexec-pipe2, 0.0)
 FAST(android-cloexec-socket, 1.0)
-FAST(android-comparison-in-temp-failure-retry, 0.0)
-FAST(boost-use-to-string, 1.0)
-FAST(bugprone-argument-comment, 2.0)
+FAST(android-comparison-in-temp-failure-retry, 1.0)
+FAST(boost-use-ranges, 2.0)
+FAST(boost-use-to-string, 2.0)
+FAST(bugprone-argument-comment, 4.0)
 FAST(bugprone-assert-side-effect, 1.0)
-FAST(bugprone-assignment-in-if-condition, -0.0)
-FAST(bugprone-bad-signal-to-kill-thread, -1.0)
+FAST(bugprone-assignment-in-if-condition, 2.0)
+FAST(bugprone-bad-signal-to-kill-thread, 1.0)
 FAST(bugprone-bool-pointer-implicit-conversion, 0.0)
-FAST(bugprone-branch-clone, -0.0)
+FAST(bugprone-branch-clone, 1.0)
+FAST(bugprone-casting-through-void, 1.0)
+FAST(bugprone-chained-comparison, 1.0)
+FAST(bugprone-compare-pointer-to-member-virtual-function, -0.0)
 FAST(bugprone-copy-constructor-init, 1.0)
-FAST(bugprone-dangling-handle, 0.0)
-FAST(bugprone-dynamic-static-initializers, 1.0)
+FAST(bugprone-crtp-constructor-accessibility, 0

[llvm-branch-commits] [clang-tools-extra] release/19.x: [clangd] Update TidyFastChecks for release/19.x (#106354) (PR #106989)

2024-09-02 Thread via llvm-branch-commits

llvmbot wrote:

@HighCommander4 What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/106989


[llvm-branch-commits] [clang-tools-extra] release/19.x: [clangd] Update TidyFastChecks for release/19.x (#106354) (PR #106989)

2024-09-02 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clangd

Author: None (llvmbot)


Changes

Backport b47d7ce8121b1cb1923e879d58eaa1d63aeaaae2

Requested by: @kadircet

---

Patch is 30.06 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/106989.diff


1 Files Affected:

- (modified) clang-tools-extra/clangd/TidyFastChecks.inc (+367-302) 


``diff
diff --git a/clang-tools-extra/clangd/TidyFastChecks.inc b/clang-tools-extra/clangd/TidyFastChecks.inc
index 9050ce16127ff4..de1a025602fa9c 100644
--- a/clang-tools-extra/clangd/TidyFastChecks.inc
+++ b/clang-tools-extra/clangd/TidyFastChecks.inc
@@ -7,370 +7,435 @@
 #define SLOW(CHECK, DELTA)
 #endif
 
-FAST(abseil-cleanup-ctad, -1.0)
+FAST(abseil-cleanup-ctad, -2.0)
 FAST(abseil-duration-addition, 0.0)
-FAST(abseil-duration-comparison, 1.0)
-FAST(abseil-duration-conversion-cast, 3.0)
-FAST(abseil-duration-division, -0.0)
-FAST(abseil-duration-factory-float, 1.0)
-FAST(abseil-duration-factory-scale, -0.0)
-FAST(abseil-duration-subtraction, 1.0)
-FAST(abseil-duration-unnecessary-conversion, 4.0)
-FAST(abseil-faster-strsplit-delimiter, 2.0)
-FAST(abseil-no-internal-dependencies, -1.0)
-FAST(abseil-no-namespace, -1.0)
-FAST(abseil-redundant-strcat-calls, 2.0)
-FAST(abseil-str-cat-append, 1.0)
-FAST(abseil-string-find-startswith, 1.0)
-FAST(abseil-string-find-str-contains, 1.0)
-FAST(abseil-time-comparison, -0.0)
-FAST(abseil-time-subtraction, 0.0)
+FAST(abseil-duration-comparison, -1.0)
+FAST(abseil-duration-conversion-cast, -1.0)
+FAST(abseil-duration-division, 0.0)
+FAST(abseil-duration-factory-float, 2.0)
+FAST(abseil-duration-factory-scale, 1.0)
+FAST(abseil-duration-subtraction, -1.0)
+FAST(abseil-duration-unnecessary-conversion, -0.0)
+FAST(abseil-faster-strsplit-delimiter, 3.0)
+FAST(abseil-no-internal-dependencies, 1.0)
+FAST(abseil-no-namespace, -0.0)
+FAST(abseil-redundant-strcat-calls, 1.0)
+FAST(abseil-str-cat-append, -0.0)
+FAST(abseil-string-find-startswith, -1.0)
+FAST(abseil-string-find-str-contains, 4.0)
+FAST(abseil-time-comparison, -1.0)
+FAST(abseil-time-subtraction, 1.0)
 FAST(abseil-upgrade-duration-conversions, 2.0)
 SLOW(altera-id-dependent-backward-branch, 13.0)
-FAST(altera-kernel-name-restriction, -1.0)
-FAST(altera-single-work-item-barrier, -1.0)
-FAST(altera-struct-pack-align, -1.0)
+FAST(altera-kernel-name-restriction, 4.0)
+FAST(altera-single-work-item-barrier, 1.0)
+FAST(altera-struct-pack-align, -0.0)
 FAST(altera-unroll-loops, 2.0)
-FAST(android-cloexec-accept, -1.0)
-FAST(android-cloexec-accept4, 3.0)
-FAST(android-cloexec-creat, 0.0)
-FAST(android-cloexec-dup, 3.0)
-FAST(android-cloexec-epoll-create, -2.0)
-FAST(android-cloexec-epoll-create1, -1.0)
-FAST(android-cloexec-fopen, -0.0)
-FAST(android-cloexec-inotify-init, 1.0)
-FAST(android-cloexec-inotify-init1, 2.0)
-FAST(android-cloexec-memfd-create, 2.0)
-FAST(android-cloexec-open, -1.0)
-FAST(android-cloexec-pipe, -1.0)
+FAST(android-cloexec-accept, 0.0)
+FAST(android-cloexec-accept4, 1.0)
+FAST(android-cloexec-creat, 1.0)
+FAST(android-cloexec-dup, 0.0)
+FAST(android-cloexec-epoll-create, 2.0)
+FAST(android-cloexec-epoll-create1, 0.0)
+FAST(android-cloexec-fopen, -1.0)
+FAST(android-cloexec-inotify-init, 2.0)
+FAST(android-cloexec-inotify-init1, -0.0)
+FAST(android-cloexec-memfd-create, -1.0)
+FAST(android-cloexec-open, 1.0)
+FAST(android-cloexec-pipe, -0.0)
 FAST(android-cloexec-pipe2, 0.0)
 FAST(android-cloexec-socket, 1.0)
-FAST(android-comparison-in-temp-failure-retry, 0.0)
-FAST(boost-use-to-string, 1.0)
-FAST(bugprone-argument-comment, 2.0)
+FAST(android-comparison-in-temp-failure-retry, 1.0)
+FAST(boost-use-ranges, 2.0)
+FAST(boost-use-to-string, 2.0)
+FAST(bugprone-argument-comment, 4.0)
 FAST(bugprone-assert-side-effect, 1.0)
-FAST(bugprone-assignment-in-if-condition, -0.0)
-FAST(bugprone-bad-signal-to-kill-thread, -1.0)
+FAST(bugprone-assignment-in-if-condition, 2.0)
+FAST(bugprone-bad-signal-to-kill-thread, 1.0)
 FAST(bugprone-bool-pointer-implicit-conversion, 0.0)
-FAST(bugprone-branch-clone, -0.0)
+FAST(bugprone-branch-clone, 1.0)
+FAST(bugprone-casting-through-void, 1.0)
+FAST(bugprone-chained-comparison, 1.0)
+FAST(bugprone-compare-pointer-to-member-virtual-function, -0.0)
 FAST(bugprone-copy-constructor-init, 1.0)
-FAST(bugprone-dangling-handle, 0.0)
-FAST(bugprone-dynamic-static-initializers, 1.0)
+FAST(bugprone-crtp-constructor-accessibility, 0.0)
+FAST(bugprone-dangling-handle, -0.0)
+FAST(bugprone-dynamic-static-initializers, 0.0)
 FAST(bugprone-easily-swappable-parameters, 2.0)
-FAST(bugprone-exception-escape, 1.0)
-FAST(bugprone-fold-init-type, 2.0)
+FAST(bugprone-empty-catch, 1.0)
+FAST(bugprone-exception-escape, 0.0)
+FAST(bugprone-fold-init-type, 1.0)
 FAST(bugprone-forward-declaration-namespace, 0.0)
-FAST(bugprone-forwarding-reference-overload, -0.0)
-FAST(bugprone-implicit-widening-of-multiplication-result, 3.0)
+FAST(bugprone-forwarding-reference-overload, -1.0)
+FAST(bugprone-implicit-widening-of-multiplicat

[llvm-branch-commits] [clang-tools-extra] release/19.x: [clangd] Update TidyFastChecks for release/19.x (#106354) (PR #106989)

2024-09-02 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-tools-extra

Author: None (llvmbot)


Changes

Backport b47d7ce8121b1cb1923e879d58eaa1d63aeaaae2

Requested by: @kadircet

---

Patch is 30.06 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/106989.diff


1 Files Affected:

- (modified) clang-tools-extra/clangd/TidyFastChecks.inc (+367-302) 


``diff
diff --git a/clang-tools-extra/clangd/TidyFastChecks.inc b/clang-tools-extra/clangd/TidyFastChecks.inc
index 9050ce16127ff4..de1a025602fa9c 100644
--- a/clang-tools-extra/clangd/TidyFastChecks.inc
+++ b/clang-tools-extra/clangd/TidyFastChecks.inc
@@ -7,370 +7,435 @@
 #define SLOW(CHECK, DELTA)
 #endif
 
-FAST(abseil-cleanup-ctad, -1.0)
+FAST(abseil-cleanup-ctad, -2.0)
 FAST(abseil-duration-addition, 0.0)
-FAST(abseil-duration-comparison, 1.0)
-FAST(abseil-duration-conversion-cast, 3.0)
-FAST(abseil-duration-division, -0.0)
-FAST(abseil-duration-factory-float, 1.0)
-FAST(abseil-duration-factory-scale, -0.0)
-FAST(abseil-duration-subtraction, 1.0)
-FAST(abseil-duration-unnecessary-conversion, 4.0)
-FAST(abseil-faster-strsplit-delimiter, 2.0)
-FAST(abseil-no-internal-dependencies, -1.0)
-FAST(abseil-no-namespace, -1.0)
-FAST(abseil-redundant-strcat-calls, 2.0)
-FAST(abseil-str-cat-append, 1.0)
-FAST(abseil-string-find-startswith, 1.0)
-FAST(abseil-string-find-str-contains, 1.0)
-FAST(abseil-time-comparison, -0.0)
-FAST(abseil-time-subtraction, 0.0)
+FAST(abseil-duration-comparison, -1.0)
+FAST(abseil-duration-conversion-cast, -1.0)
+FAST(abseil-duration-division, 0.0)
+FAST(abseil-duration-factory-float, 2.0)
+FAST(abseil-duration-factory-scale, 1.0)
+FAST(abseil-duration-subtraction, -1.0)
+FAST(abseil-duration-unnecessary-conversion, -0.0)
+FAST(abseil-faster-strsplit-delimiter, 3.0)
+FAST(abseil-no-internal-dependencies, 1.0)
+FAST(abseil-no-namespace, -0.0)
+FAST(abseil-redundant-strcat-calls, 1.0)
+FAST(abseil-str-cat-append, -0.0)
+FAST(abseil-string-find-startswith, -1.0)
+FAST(abseil-string-find-str-contains, 4.0)
+FAST(abseil-time-comparison, -1.0)
+FAST(abseil-time-subtraction, 1.0)
 FAST(abseil-upgrade-duration-conversions, 2.0)
 SLOW(altera-id-dependent-backward-branch, 13.0)
-FAST(altera-kernel-name-restriction, -1.0)
-FAST(altera-single-work-item-barrier, -1.0)
-FAST(altera-struct-pack-align, -1.0)
+FAST(altera-kernel-name-restriction, 4.0)
+FAST(altera-single-work-item-barrier, 1.0)
+FAST(altera-struct-pack-align, -0.0)
 FAST(altera-unroll-loops, 2.0)
-FAST(android-cloexec-accept, -1.0)
-FAST(android-cloexec-accept4, 3.0)
-FAST(android-cloexec-creat, 0.0)
-FAST(android-cloexec-dup, 3.0)
-FAST(android-cloexec-epoll-create, -2.0)
-FAST(android-cloexec-epoll-create1, -1.0)
-FAST(android-cloexec-fopen, -0.0)
-FAST(android-cloexec-inotify-init, 1.0)
-FAST(android-cloexec-inotify-init1, 2.0)
-FAST(android-cloexec-memfd-create, 2.0)
-FAST(android-cloexec-open, -1.0)
-FAST(android-cloexec-pipe, -1.0)
+FAST(android-cloexec-accept, 0.0)
+FAST(android-cloexec-accept4, 1.0)
+FAST(android-cloexec-creat, 1.0)
+FAST(android-cloexec-dup, 0.0)
+FAST(android-cloexec-epoll-create, 2.0)
+FAST(android-cloexec-epoll-create1, 0.0)
+FAST(android-cloexec-fopen, -1.0)
+FAST(android-cloexec-inotify-init, 2.0)
+FAST(android-cloexec-inotify-init1, -0.0)
+FAST(android-cloexec-memfd-create, -1.0)
+FAST(android-cloexec-open, 1.0)
+FAST(android-cloexec-pipe, -0.0)
 FAST(android-cloexec-pipe2, 0.0)
 FAST(android-cloexec-socket, 1.0)
-FAST(android-comparison-in-temp-failure-retry, 0.0)
-FAST(boost-use-to-string, 1.0)
-FAST(bugprone-argument-comment, 2.0)
+FAST(android-comparison-in-temp-failure-retry, 1.0)
+FAST(boost-use-ranges, 2.0)
+FAST(boost-use-to-string, 2.0)
+FAST(bugprone-argument-comment, 4.0)
 FAST(bugprone-assert-side-effect, 1.0)
-FAST(bugprone-assignment-in-if-condition, -0.0)
-FAST(bugprone-bad-signal-to-kill-thread, -1.0)
+FAST(bugprone-assignment-in-if-condition, 2.0)
+FAST(bugprone-bad-signal-to-kill-thread, 1.0)
 FAST(bugprone-bool-pointer-implicit-conversion, 0.0)
-FAST(bugprone-branch-clone, -0.0)
+FAST(bugprone-branch-clone, 1.0)
+FAST(bugprone-casting-through-void, 1.0)
+FAST(bugprone-chained-comparison, 1.0)
+FAST(bugprone-compare-pointer-to-member-virtual-function, -0.0)
 FAST(bugprone-copy-constructor-init, 1.0)
-FAST(bugprone-dangling-handle, 0.0)
-FAST(bugprone-dynamic-static-initializers, 1.0)
+FAST(bugprone-crtp-constructor-accessibility, 0.0)
+FAST(bugprone-dangling-handle, -0.0)
+FAST(bugprone-dynamic-static-initializers, 0.0)
 FAST(bugprone-easily-swappable-parameters, 2.0)
-FAST(bugprone-exception-escape, 1.0)
-FAST(bugprone-fold-init-type, 2.0)
+FAST(bugprone-empty-catch, 1.0)
+FAST(bugprone-exception-escape, 0.0)
+FAST(bugprone-fold-init-type, 1.0)
 FAST(bugprone-forward-declaration-namespace, 0.0)
-FAST(bugprone-forwarding-reference-overload, -0.0)
-FAST(bugprone-implicit-widening-of-multiplication-result, 3.0)
+FAST(bugprone-forwarding-reference-overload, -1.0)
+FAST(bugprone-implicit-widening-of-

[llvm-branch-commits] [compiler-rt] release/19.x: [builtins] Fix divtc3.c etc. compilation on Solaris/SPARC with gcc (#101662) (PR #101847)

2024-09-02 Thread Rainer Orth via llvm-branch-commits

rorth wrote:

It's difficult: on one hand it fixes a Solaris/SPARC build failure.  On the 
other, it's said to cause problems for an out-of-tree z/OS port.  
Unfortunately, the developers refuse to publish their code, so it's almost 
impossible to reason about that code.

https://github.com/llvm/llvm-project/pull/101847


[llvm-branch-commits] [llvm] release/19.x: [AVR] Fix 16-bit LDDs with immediate overflows (#104923) (PR #106993)

2024-09-02 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/106993


[llvm-branch-commits] [llvm] release/19.x: [AVR] Fix 16-bit LDDs with immediate overflows (#104923) (PR #106993)

2024-09-02 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/106993

Backport c7a4efa

Requested by: @EugeneZelenko

>From 2e14b75913fb4e90c7b833b351aee99dc9b0bdbd Mon Sep 17 00:00:00 2001
From: Patryk Wychowaniec 
Date: Thu, 29 Aug 2024 09:28:17 +0200
Subject: [PATCH] [AVR] Fix 16-bit LDDs with immediate overflows (#104923)

16-bit loads are expanded into a pair of 8-bit loads, so the maximum
offset of such 16-bit loads must be 62, not 63.
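
A minimal sketch of the arithmetic behind that limit, not the code from the patch: `AccessWidth`, `MaxDisp` and `isLegalDisp` are hypothetical names used only for illustration. It assumes, as the commit message says, that a 16-bit access at offset N becomes byte accesses at N and N + 1, and that each of those must fit in the unsigned 6-bit displacement.

```cpp
#include <cstdint>

// Hypothetical stand-in for the two value types the patch distinguishes
// (MVT::i8 and MVT::i16 in the real code); the value is the width in bytes.
enum class AccessWidth : std::uint8_t { I8 = 1, I16 = 2 };

// An unsigned 6-bit displacement can encode 0..63.
constexpr std::uint64_t MaxDisp = 63;

// Returns true if every byte touched by the access is reachable with a
// 6-bit displacement. A 16-bit access at Offset is split into byte accesses
// at Offset and Offset + 1, so the last byte must also be <= 63.
constexpr bool isLegalDisp(AccessWidth W, std::uint64_t Offset) {
  const std::uint64_t Bytes = static_cast<std::uint64_t>(W);
  return Offset <= MaxDisp && Offset + Bytes - 1 <= MaxDisp;
}

// Compile-time checks of the boundary cases named in the commit message.
static_assert(isLegalDisp(AccessWidth::I8, 63), "8-bit access may use 63");
static_assert(isLegalDisp(AccessWidth::I16, 62), "16-bit access may use 62");
static_assert(!isLegalDisp(AccessWidth::I16, 63), "63 + 1 does not fit");
static_assert(!isLegalDisp(AccessWidth::I8, 64), "64 does not fit in 6 bits");
```

Checking the last byte rather than hard-coding 62 gives the same answer for these two widths and makes the reason for the off-by-one visible.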

(cherry picked from commit c7a4efa4294789b1116f0c4a320c16fcb27cb62c)
---
 llvm/lib/Target/AVR/AVRISelDAGToDAG.cpp   |   9 +-
 .../CodeGen/AVR/ldd-immediate-overflow.ll | 144 ++
 .../CodeGen/AVR/std-immediate-overflow.ll | 137 +
 3 files changed, 288 insertions(+), 2 deletions(-)
 create mode 100644 llvm/test/CodeGen/AVR/ldd-immediate-overflow.ll
 create mode 100644 llvm/test/CodeGen/AVR/std-immediate-overflow.ll

diff --git a/llvm/lib/Target/AVR/AVRISelDAGToDAG.cpp b/llvm/lib/Target/AVR/AVRISelDAGToDAG.cpp
index 77db876d47e446..a8927d834630ea 100644
--- a/llvm/lib/Target/AVR/AVRISelDAGToDAG.cpp
+++ b/llvm/lib/Target/AVR/AVRISelDAGToDAG.cpp
@@ -122,8 +122,13 @@ bool AVRDAGToDAGISel::SelectAddr(SDNode *Op, SDValue N, SDValue &Base,
 // offset allowed.
 MVT VT = cast<MemSDNode>(Op)->getMemoryVT().getSimpleVT();
 
-// We only accept offsets that fit in 6 bits (unsigned).
-if (isUInt<6>(RHSC) && (VT == MVT::i8 || VT == MVT::i16)) {
+// We only accept offsets that fit in 6 bits (unsigned), with the exception
+// of 16-bit loads - those can only go up to 62, because we desugar them
+// into a pair of 8-bit loads like `ldd rx, RHSC` + `ldd ry, RHSC + 1`.
+bool OkI8 = VT == MVT::i8 && RHSC <= 63;
+bool OkI16 = VT == MVT::i16 && RHSC <= 62;
+
+if (OkI8 || OkI16) {
   Base = N.getOperand(0);
   Disp = CurDAG->getTargetConstant(RHSC, dl, MVT::i8);
 
diff --git a/llvm/test/CodeGen/AVR/ldd-immediate-overflow.ll b/llvm/test/CodeGen/AVR/ldd-immediate-overflow.ll
new file mode 100644
index 00..6f1a4b32bb054c
--- /dev/null
+++ b/llvm/test/CodeGen/AVR/ldd-immediate-overflow.ll
@@ -0,0 +1,144 @@
+; RUN: llc -march=avr -filetype=asm -O1 < %s | FileCheck %s
+
+define void @check60(ptr %1) {
+; CHECK-LABEL: check60:
+; CHECK-NEXT: %bb.0
+; CHECK-NEXT: mov r30, r24
+; CHECK-NEXT: mov r31, r25
+; CHECK-NEXT: ldd r24, Z+60
+; CHECK-NEXT: ldd r25, Z+61
+; CHECK-NEXT: ldd r18, Z+62
+; CHECK-NEXT: ldd r19, Z+63
+; CHECK-NEXT: sts 3, r19
+; CHECK-NEXT: sts 2, r18
+; CHECK-NEXT: sts 1, r25
+; CHECK-NEXT: sts 0, r24
+; CHECK-NEXT: ret
+
+bb0:
+  %2 = getelementptr i8, ptr %1, i16 60
+  %3 = load i32, ptr %2, align 1
+  store i32 %3, ptr null, align 1
+  ret void
+}
+
+define void @check61(ptr %1) {
+; CHECK-LABEL: check61:
+; CHECK-NEXT: %bb.0
+; CHECK-NEXT: mov r30, r24
+; CHECK-NEXT: mov r31, r25
+; CHECK-NEXT: ldd r18, Z+61
+; CHECK-NEXT: ldd r19, Z+62
+; CHECK-NEXT: adiw r24, 63
+; CHECK-NEXT: mov r30, r24
+; CHECK-NEXT: mov r31, r25
+; CHECK-NEXT: ld r24, Z
+; CHECK-NEXT: ldd r25, Z+1
+; CHECK-NEXT: sts 3, r25
+; CHECK-NEXT: sts 2, r24
+; CHECK-NEXT: sts 1, r19
+; CHECK-NEXT: sts 0, r18
+; CHECK-NEXT: ret
+
+bb0:
+  %2 = getelementptr i8, ptr %1, i16 61
+  %3 = load i32, ptr %2, align 1
+  store i32 %3, ptr null, align 1
+  ret void
+}
+
+define void @check62(ptr %1) {
+; CHECK-LABEL: check62:
+; CHECK-NEXT: %bb.0
+; CHECK-NEXT: mov r30, r24
+; CHECK-NEXT: mov r31, r25
+; CHECK-NEXT: ldd r18, Z+62
+; CHECK-NEXT: ldd r19, Z+63
+; CHECK-NEXT: adiw r24, 62
+; CHECK-NEXT: mov r30, r24
+; CHECK-NEXT: mov r31, r25
+; CHECK-NEXT: ldd r24, Z+2
+; CHECK-NEXT: ldd r25, Z+3
+; CHECK-NEXT: sts 3, r25
+; CHECK-NEXT: sts 2, r24
+; CHECK-NEXT: sts 1, r19
+; CHECK-NEXT: sts 0, r18
+; CHECK-NEXT: ret
+
+bb0:
+  %2 = getelementptr i8, ptr %1, i16 62
+  %3 = load i32, ptr %2, align 1
+  store i32 %3, ptr null, align 1
+  ret void
+}
+
+define void @check63(ptr %1) {
+; CHECK-LABEL: check63:
+; CHECK-NEXT: %bb.0
+; CHECK-NEXT: adiw r24, 63
+; CHECK-NEXT: mov r30, r24
+; CHECK-NEXT: mov r31, r25
+; CHECK-NEXT: ld r24, Z
+; CHECK-NEXT: ldd r25, Z+1
+; CHECK-NEXT: ldd r18, Z+2
+; CHECK-NEXT: ldd r19, Z+3
+; CHECK-NEXT: sts 3, r19
+; CHECK-NEXT: sts 2, r18
+; CHECK-NEXT: sts 1, r25
+; CHECK-NEXT: sts 0, r24
+; CHECK-NEXT: ret
+
+bb0:
+  %2 = getelementptr i8, ptr %1, i16 63
+  %3 = load i32, ptr %2, align 1
+  store i32 %3, ptr null, align 1
+  ret void
+}
+
+define void @check64(ptr %1) {
+; CHECK-LABEL: check64:
+; CHECK-NEXT: %bb.0
+; CHECK-NEXT: subi r24, 192
+; CHECK-NEXT: sbci r25, 255
+; CHECK-NEXT: mov r30, r24
+; CHECK-NEXT: mov r31, r25
+; CHECK-NEXT: ld r24, Z
+; CHECK-NEXT: ldd r25, Z+1
+; CHECK-NEXT: ldd r18, Z+2
+; CHECK-NEXT: ldd r19, Z+3
+; CHECK-NEXT: sts 3, r19
+; CHECK-NEXT: sts 2, r18
+; CHECK-NEXT: sts 1, r25
+; CHECK-NEXT: sts 0, r24
+; CHECK-NEXT: ret
+
+bb0:
+  %2 = getelementptr i8, ptr %1, i16 64
+  %3 = load i32, ptr %2, align 1
+  store i32 %3, ptr null, align 1
+  ret void
+}
+

[llvm-branch-commits] [clang-tools-extra] release/19.x: [clangd] Update TidyFastChecks for release/19.x (#106354) (PR #106989)

2024-09-02 Thread Nathan Ridge via llvm-branch-commits

https://github.com/HighCommander4 approved this pull request.

+1 from me

https://github.com/llvm/llvm-project/pull/106989


[llvm-branch-commits] [llvm] release/19.x: [MIPS] Optimize sortRelocs for o32 (PR #106008)

2024-09-02 Thread Alex Rønne Petersen via llvm-branch-commits

alexrp wrote:

@MaskRay @topperc @wzssyqa @yingopq sorry for the pings, but I assume today is 
the last chance to get this in, so I would love to hear your thoughts on 
whether you think that's a good idea. :slightly_smiling_face:

https://github.com/llvm/llvm-project/pull/106008


[llvm-branch-commits] [clang] [Serialization] Code cleanups and polish 83233 (PR #83237)

2024-09-02 Thread Ilya Biryukov via llvm-branch-commits

ilya-biryukov wrote:

Thanks for fixing the problem.

@alexfh and another person who was running these investigations before are on 
vacation until next week. I will ask if someone else can do this for them, but 
I wouldn't be surprised if it's involved enough that we may need to wait 
until next week.

Sorry about the long waiting times, but I still wanted to share so that folks 
are aware of the timelines.

https://github.com/llvm/llvm-project/pull/83237


[llvm-branch-commits] [llvm] release/19.x: [MIPS] Optimize sortRelocs for o32 (PR #106008)

2024-09-02 Thread YunQiang Su via llvm-branch-commits

wzssyqa wrote:

I don't think that this is a bugfix, so it does not need to be backported.

https://github.com/llvm/llvm-project/pull/106008


[llvm-branch-commits] [llvm] [X86] Avoid generating nested CALLSEQ for TLS pointer function arguments (PR #106965)

2024-09-02 Thread Shilei Tian via llvm-branch-commits


@@ -0,0 +1,17 @@
+; RUN: llc -verify-machineinstrs < %s -relocation-model=pic
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+; Passing a pointer to thread-local storage to a function can be problematic
+; since computing such addresses requires a function call that is introduced
+; very late in instruction selection. We need to ensure that we don't introduce
+; nested call sequence markers if this function call happens in a call sequence.
+
+@TLS = internal thread_local global i64 zeroinitializer, align 8
+declare void @bar(ptr)
+define internal void @foo() {
+call void @bar(ptr @TLS)
+call void @bar(ptr @TLS)
+ret void
+}

shiltian wrote:

Add an empty line at the end of the file.

https://github.com/llvm/llvm-project/pull/106965


[llvm-branch-commits] [llvm] [X86] Avoid generating nested CALLSEQ for TLS pointer function arguments (PR #106965)

2024-09-02 Thread Phoebe Wang via llvm-branch-commits


@@ -0,0 +1,17 @@
+; RUN: llc -verify-machineinstrs < %s -relocation-model=pic
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"

phoebewang wrote:

This can be put in the RUN line, like
`; RUN: llc -mtriple=x86_64 -verify-machineinstrs < %s -relocation-model=pic | FileCheck %s`

Also add FileCheck checks to show the assembly.

https://github.com/llvm/llvm-project/pull/106965


[llvm-branch-commits] [llvm] [X86] Avoid generating nested CALLSEQ for TLS pointer function arguments (PR #106965)

2024-09-02 Thread Phoebe Wang via llvm-branch-commits


@@ -0,0 +1,17 @@
+; RUN: llc -verify-machineinstrs < %s -relocation-model=pic
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"

phoebewang wrote:

This is not needed.

https://github.com/llvm/llvm-project/pull/106965


[llvm-branch-commits] [llvm] [X86] Avoid generating nested CALLSEQ for TLS pointer function arguments (PR #106965)

2024-09-02 Thread Phoebe Wang via llvm-branch-commits


@@ -0,0 +1,17 @@
+; RUN: llc -verify-machineinstrs < %s -relocation-model=pic
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+; Passing a pointer to thread-local storage to a function can be problematic
+; since computing such addresses requires a function call that is introduced
+; very late in instruction selection. We need to ensure that we don't introduce
+; nested call sequence markers if this function call happens in a call sequence.
+
+@TLS = internal thread_local global i64 zeroinitializer, align 8
+declare void @bar(ptr)
+define internal void @foo() {
+call void @bar(ptr @TLS)
+call void @bar(ptr @TLS)
+ret void

phoebewang wrote:

Use two-space indentation.

https://github.com/llvm/llvm-project/pull/106965


[llvm-branch-commits] [compiler-rt] release/19.x: [builtins] Fix divtc3.c etc. compilation on Solaris/SPARC with gcc (#101662) (PR #101847)

2024-09-02 Thread Tobias Hieta via llvm-branch-commits

tru wrote:

That does sound like it should be acceptable to merge if it's only blocking an 
out-of-tree implementation, since we don't officially support that 
configuration. There is also the question of whether we need to backport this: 
if the main complaint against it going into main is that an external port 
breaks, that might not be relevant in this case.

@s-barannikov @zibi2 @perry-ca 

https://github.com/llvm/llvm-project/pull/101847


[llvm-branch-commits] [llvm] release/19.x: [MIPS] Optimize sortRelocs for o32 (PR #106008)

2024-09-02 Thread Tobias Hieta via llvm-branch-commits

tru wrote:

Yeah, I tend to agree that this seems like a nice-to-have, but it doesn't 
really qualify as a bugfix or a regression fix.

https://github.com/llvm/llvm-project/pull/106008


[llvm-branch-commits] [llvm] release/19.x: [AVR] Fix parsing & emitting relative jumps (#106722) (PR #106729)

2024-09-02 Thread Tobias Hieta via llvm-branch-commits

tru wrote:

You can just close this PR and open a new one with both commits by running the 
cherry-pick comment command again with both SHAs listed.

https://github.com/llvm/llvm-project/pull/106729


[llvm-branch-commits] [clang] [llvm] release/19.x: workflows/release-binaries: Enable flang builds on Windows (#101344) (PR #106480)

2024-09-02 Thread Tobias Hieta via llvm-branch-commits

tru wrote:

@tstellar several of the builds fail even after a rebase. Some of them seem 
related (especially the macOS ones), so I won't merge this until you have had 
some time to look at it.

https://github.com/llvm/llvm-project/pull/106480