[llvm-branch-commits] [llvm] release/19.x: [RemoveDIs] Fix spliceDebugInfo splice-to-end edge case (#105671, #106723) (PR #106952)
https://github.com/OCHyams created https://github.com/llvm/llvm-project/pull/106952 Please can we backport 43661a1214353ea1773a711f403f8d1118e9ca0f (and 7ffe67c17c524c2d3056c0721a33c7012dce3061) into the next dot release. Replaces #106691 - this one includes a follow-up fix in 7ffe67c17c524c2d3056c0721a33c7012dce3061. I couldn't create two separate requests because it creates conflicts. >From 70d00400165ee76199a1e4565fdebb6d84d2eec7 Mon Sep 17 00:00:00 2001 From: Orlando Cazalet-Hyams Date: Thu, 29 Aug 2024 14:12:02 +0100 Subject: [PATCH 1/2] [RemoveDIs] Fix spliceDebugInfo splice-to-end edge case (#105671) Fix #105571 which demonstrates an end() iterator dereference when performing a non-empty splice to end() from a region that ends at Src::end(). Rather than calling Instruction::adoptDbgRecords from Dest, create a marker (which takes an iterator) and absorbDebugValues onto that. The "absorb" variant doesn't clean up the source marker, which in this case we know is a trailing marker, so we have to do that manually. (cherry picked from commit 43661a1214353ea1773a711f403f8d1118e9ca0f) --- llvm/lib/IR/BasicBlock.cpp | 12 - llvm/unittests/IR/BasicBlockDbgInfoTest.cpp | 54 + 2 files changed, 64 insertions(+), 2 deletions(-) diff --git a/llvm/lib/IR/BasicBlock.cpp b/llvm/lib/IR/BasicBlock.cpp index 0a9498f051cb59..46896d3cdf7d50 100644 --- a/llvm/lib/IR/BasicBlock.cpp +++ b/llvm/lib/IR/BasicBlock.cpp @@ -975,8 +975,16 @@ void BasicBlock::spliceDebugInfoImpl(BasicBlock::iterator Dest, BasicBlock *Src, if (ReadFromTail && Src->getMarker(Last)) { DbgMarker *FromLast = Src->getMarker(Last); if (LastIsEnd) { - Dest->adoptDbgRecords(Src, Last, true); - // adoptDbgRecords will release any trailers. + if (Dest == end()) { +// Abosrb the trailing markers from Src. 
+assert(FromLast == Src->getTrailingDbgRecords()); +createMarker(Dest)->absorbDebugValues(*FromLast, true); +FromLast->eraseFromParent(); +Src->deleteTrailingDbgRecords(); + } else { +// adoptDbgRecords will release any trailers. +Dest->adoptDbgRecords(Src, Last, true); + } assert(!Src->getTrailingDbgRecords()); } else { // FIXME: can we use adoptDbgRecords here to reduce allocations? diff --git a/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp b/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp index 835780e63aaf4f..5615a4493d20a1 100644 --- a/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp +++ b/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp @@ -1525,4 +1525,58 @@ TEST(BasicBlockDbgInfoTest, DbgMoveToEnd) { EXPECT_FALSE(Ret->hasDbgRecords()); } +TEST(BasicBlockDbgInfoTest, CloneTrailingRecordsToEmptyBlock) { + LLVMContext C; + std::unique_ptr M = parseIR(C, R"( +define i16 @foo(i16 %a) !dbg !6 { +entry: + %b = add i16 %a, 0 +#dbg_value(i16 %b, !9, !DIExpression(), !11) + ret i16 0, !dbg !11 +} + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!5} + +!0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "debugify", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2) +!1 = !DIFile(filename: "t.ll", directory: "/") +!2 = !{} +!5 = !{i32 2, !"Debug Info Version", i32 3} +!6 = distinct !DISubprogram(name: "foo", linkageName: "foo", scope: null, file: !1, line: 1, type: !7, scopeLine: 1, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !8) +!7 = !DISubroutineType(types: !2) +!8 = !{!9} +!9 = !DILocalVariable(name: "1", scope: !6, file: !1, line: 1, type: !10) +!10 = !DIBasicType(name: "ty16", size: 16, encoding: DW_ATE_unsigned) +!11 = !DILocation(line: 1, column: 1, scope: !6) +)"); + ASSERT_TRUE(M); + + Function *F = M->getFunction("foo"); + BasicBlock &BB = F->getEntryBlock(); + // Start with no trailing records. 
+ ASSERT_FALSE(BB.getTrailingDbgRecords()); + + BasicBlock::iterator Ret = std::prev(BB.end()); + BasicBlock::iterator B = std::prev(Ret); + + // Delete terminator which has debug records: we now get trailing records. + Ret->eraseFromParent(); + EXPECT_TRUE(BB.getTrailingDbgRecords()); + + BasicBlock *NewBB = BasicBlock::Create(C, "NewBB", F); + NewBB->splice(NewBB->end(), &BB, B, BB.end()); + + // The trailing records should've been absorbed into NewBB. + EXPECT_FALSE(BB.getTrailingDbgRecords()); + EXPECT_TRUE(NewBB->getTrailingDbgRecords()); + if (NewBB->getTrailingDbgRecords()) { +EXPECT_EQ( +llvm::range_size(NewBB->getTrailingDbgRecords()->getDbgRecordRange()), +1u); + } + + // Drop the trailing records now, to prevent a cleanup assertion. + NewBB->deleteTrailingDbgRecords(); +} + } // End anonymous namespace. >From a0dadeb3a4bd7332046b9af811646b730eaec9d1 Mon Sep 17 00:00:00 2001 From: Orlando Cazalet-Hyams Date: Fri, 30 Aug 2024 13:44:42 +0100 Subject: [PATCH 2/2] [RemoveDIs] Fix asan-identified leak in
[llvm-branch-commits] [llvm] release/19.x: [RemoveDIs] Fix spliceDebugInfo splice-to-end edge case (#105671, #106723) (PR #106952)
llvmbot wrote: @llvm/pr-subscribers-llvm-ir Author: Orlando Cazalet-Hyams (OCHyams) Changes Please can we backport 43661a1214353ea1773a711f403f8d1118e9ca0f (and 7ffe67c17c524c2d3056c0721a33c7012dce3061) into the next dot release. Replaces #106691 - this one includes a follow-up fix in 7ffe67c17c524c2d3056c0721a33c7012dce3061. I couldn't create two separate requests because it creates conflicts. --- Full diff: https://github.com/llvm/llvm-project/pull/106952.diff 2 Files Affected: - (modified) llvm/lib/IR/BasicBlock.cpp (+10-2) - (modified) llvm/unittests/IR/BasicBlockDbgInfoTest.cpp (+52) ``diff diff --git a/llvm/lib/IR/BasicBlock.cpp b/llvm/lib/IR/BasicBlock.cpp index 0a9498f051cb59..46896d3cdf7d50 100644 --- a/llvm/lib/IR/BasicBlock.cpp +++ b/llvm/lib/IR/BasicBlock.cpp @@ -975,8 +975,16 @@ void BasicBlock::spliceDebugInfoImpl(BasicBlock::iterator Dest, BasicBlock *Src, if (ReadFromTail && Src->getMarker(Last)) { DbgMarker *FromLast = Src->getMarker(Last); if (LastIsEnd) { - Dest->adoptDbgRecords(Src, Last, true); - // adoptDbgRecords will release any trailers. + if (Dest == end()) { +// Abosrb the trailing markers from Src. +assert(FromLast == Src->getTrailingDbgRecords()); +createMarker(Dest)->absorbDebugValues(*FromLast, true); +FromLast->eraseFromParent(); +Src->deleteTrailingDbgRecords(); + } else { +// adoptDbgRecords will release any trailers. +Dest->adoptDbgRecords(Src, Last, true); + } assert(!Src->getTrailingDbgRecords()); } else { // FIXME: can we use adoptDbgRecords here to reduce allocations? 
diff --git a/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp b/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp index 835780e63aaf4f..5ce14d3f6b9cef 100644 --- a/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp +++ b/llvm/unittests/IR/BasicBlockDbgInfoTest.cpp @@ -1525,4 +1525,56 @@ TEST(BasicBlockDbgInfoTest, DbgMoveToEnd) { EXPECT_FALSE(Ret->hasDbgRecords()); } +TEST(BasicBlockDbgInfoTest, CloneTrailingRecordsToEmptyBlock) { + LLVMContext C; + std::unique_ptr M = parseIR(C, R"( +define i16 @foo(i16 %a) !dbg !6 { +entry: + %b = add i16 %a, 0 +#dbg_value(i16 %b, !9, !DIExpression(), !11) + ret i16 0, !dbg !11 +} + +!llvm.dbg.cu = !{!0} +!llvm.module.flags = !{!5} + +!0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "debugify", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2) +!1 = !DIFile(filename: "t.ll", directory: "/") +!2 = !{} +!5 = !{i32 2, !"Debug Info Version", i32 3} +!6 = distinct !DISubprogram(name: "foo", linkageName: "foo", scope: null, file: !1, line: 1, type: !7, scopeLine: 1, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !8) +!7 = !DISubroutineType(types: !2) +!8 = !{!9} +!9 = !DILocalVariable(name: "1", scope: !6, file: !1, line: 1, type: !10) +!10 = !DIBasicType(name: "ty16", size: 16, encoding: DW_ATE_unsigned) +!11 = !DILocation(line: 1, column: 1, scope: !6) +)"); + ASSERT_TRUE(M); + + Function *F = M->getFunction("foo"); + BasicBlock &BB = F->getEntryBlock(); + // Start with no trailing records. + ASSERT_FALSE(BB.getTrailingDbgRecords()); + + BasicBlock::iterator Ret = std::prev(BB.end()); + BasicBlock::iterator B = std::prev(Ret); + + // Delete terminator which has debug records: we now get trailing records. + Ret->eraseFromParent(); + EXPECT_TRUE(BB.getTrailingDbgRecords()); + + BasicBlock *NewBB = BasicBlock::Create(C, "NewBB", F); + NewBB->splice(NewBB->end(), &BB, B, BB.end()); + + // The trailing records should've been absorbed into NewBB. 
+ EXPECT_FALSE(BB.getTrailingDbgRecords()); + EXPECT_TRUE(NewBB->getTrailingDbgRecords()); + if (DbgMarker *Trailing = NewBB->getTrailingDbgRecords()) { +EXPECT_EQ(llvm::range_size(Trailing->getDbgRecordRange()), 1u); +// Drop the trailing records now, to prevent a cleanup assertion. +Trailing->eraseFromParent(); +NewBB->deleteTrailingDbgRecords(); + } +} + } // End anonymous namespace. `` https://github.com/llvm/llvm-project/pull/106952 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [RemoveDIs] Fix spliceDebugInfo splice-to-end edge case (#105671) (PR #106691)
OCHyams wrote: Re-opened backport request with the fix too - #106952 https://github.com/llvm/llvm-project/pull/106691
[llvm-branch-commits] [llvm] release/19.x: [RemoveDIs] Fix spliceDebugInfo splice-to-end edge case (#105671) (PR #106691)
https://github.com/OCHyams closed https://github.com/llvm/llvm-project/pull/106691
[llvm-branch-commits] [llvm] release/19.x: [RemoveDIs] Fix spliceDebugInfo splice-to-end edge case (#105671, #106723) (PR #106952)
https://github.com/OCHyams edited https://github.com/llvm/llvm-project/pull/106952
[llvm-branch-commits] [llvm] c0f53ea - Revert "[RuntimeDyld][Windows] Allocate space for dllimport things. (#102586)"
Author: Alastair Houghton Date: 2024-09-02T10:24:44+01:00 New Revision: c0f53ea70d7886b6504aa787b834b8216a4b3367 URL: https://github.com/llvm/llvm-project/commit/c0f53ea70d7886b6504aa787b834b8216a4b3367 DIFF: https://github.com/llvm/llvm-project/commit/c0f53ea70d7886b6504aa787b834b8216a4b3367.diff LOG: Revert "[RuntimeDyld][Windows] Allocate space for dllimport things. (#102586)" This reverts commit a0a253181e3eb2e7173a37b043b82325c7cddd67. Added: Modified: llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.cpp llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.h llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h Removed: diff --git a/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp b/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp index 5ac5532705dc49..7eb7da0138c972 100644 --- a/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp +++ b/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp @@ -690,12 +690,9 @@ unsigned RuntimeDyldImpl::computeSectionStubBufSize(const ObjectFile &Obj, if (!(RelSecI == Section)) continue; -for (const RelocationRef &Reloc : SI->relocations()) { +for (const RelocationRef &Reloc : SI->relocations()) if (relocationNeedsStub(Reloc)) StubBufSize += StubSize; - if (relocationNeedsDLLImportStub(Reloc)) -StubBufSize = sizeAfterAddingDLLImportStub(StubBufSize); -} } // Get section data size and alignment diff --git a/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.cpp b/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.cpp index 73b37ee0ff3311..25a2d8780fb56c 100644 --- a/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.cpp +++ b/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.cpp @@ -119,14 +119,4 @@ bool RuntimeDyldCOFF::isCompatibleFile(const object::ObjectFile &Obj) const { return Obj.isCOFF(); } -bool RuntimeDyldCOFF::relocationNeedsDLLImportStub( -const RelocationRef &R) const { - object::symbol_iterator Symbol = R.getSymbol(); - Expected 
TargetNameOrErr = Symbol->getName(); - if (!TargetNameOrErr) -return false; - - return TargetNameOrErr->starts_with(getImportSymbolPrefix()); -} - } // namespace llvm diff --git a/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.h b/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.h index 51d177c7bb8bec..25e3783cf160b2 100644 --- a/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.h +++ b/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldCOFF.h @@ -14,7 +14,6 @@ #define LLVM_RUNTIME_DYLD_COFF_H #include "RuntimeDyldImpl.h" -#include "llvm/Support/MathExtras.h" namespace llvm { @@ -46,12 +45,6 @@ class RuntimeDyldCOFF : public RuntimeDyldImpl { static constexpr StringRef getImportSymbolPrefix() { return "__imp_"; } - bool relocationNeedsDLLImportStub(const RelocationRef &R) const; - - unsigned sizeAfterAddingDLLImportStub(unsigned Size) const { -return alignTo(Size, PointerSize) + PointerSize; - } - private: unsigned PointerSize; uint32_t PointerReloc; diff --git a/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h b/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h index de7630b9747ea4..e09c632842d6e9 100644 --- a/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h +++ b/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h @@ -455,16 +455,6 @@ class RuntimeDyldImpl { return true;// Conservative answer } - // Return true if the relocation R may require allocating a DLL import stub. - virtual bool relocationNeedsDLLImportStub(const RelocationRef &R) const { -return false; - } - - // Add the size of a DLL import stub to the buffer size - virtual unsigned sizeAfterAddingDLLImportStub(unsigned Size) const { -return Size; - } - public: RuntimeDyldImpl(RuntimeDyld::MemoryManager &MemMgr, JITSymbolResolver &Resolver) ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [LoongArch] Add TTI support for cpop with LSX (PR #106961)
https://github.com/wangleiat created https://github.com/llvm/llvm-project/pull/106961 None
[llvm-branch-commits] [LoongArch] Add TTI support for cpop with LSX (PR #106961)
llvmbot wrote: @llvm/pr-subscribers-backend-loongarch Author: wanglei (wangleiat) Changes --- Full diff: https://github.com/llvm/llvm-project/pull/106961.diff 4 Files Affected: - (modified) llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp (+7) - (modified) llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.h (+1) - (added) llvm/test/Transforms/LoopIdiom/LoongArch/lit.local.cfg (+2) - (added) llvm/test/Transforms/LoopIdiom/LoongArch/popcnt.ll (+320) ``diff diff --git a/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp b/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp index 2c7b0bfeaaad52..3b227fd7e4345c 100644 --- a/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp +++ b/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp @@ -83,4 +83,11 @@ const char *LoongArchTTIImpl::getRegisterClassName(unsigned ClassID) const { llvm_unreachable("unknown register class"); } +TargetTransformInfo::PopcntSupportKind +LoongArchTTIImpl::getPopcntSupport(unsigned TyWidth) { + assert(isPowerOf2_32(TyWidth) && "Ty width must be power of 2"); + llvm::errs() << "XXX: " << TyWidth << "\n"; + return ST->hasExtLSX() ? TTI::PSK_FastHardware : TTI::PSK_Software; +} + // TODO: Implement more hooks to provide TTI machinery for LoongArch. diff --git a/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.h b/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.h index b2eef80dd9d3d2..f7ce75173be203 100644 --- a/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.h +++ b/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.h @@ -45,6 +45,7 @@ class LoongArchTTIImpl : public BasicTTIImplBase { unsigned getRegisterClassForType(bool Vector, Type *Ty = nullptr) const; unsigned getMaxInterleaveFactor(ElementCount VF); const char *getRegisterClassName(unsigned ClassID) const; + TTI::PopcntSupportKind getPopcntSupport(unsigned TyWidth); // TODO: Implement more hooks to provide TTI machinery for LoongArch. 
}; diff --git a/llvm/test/Transforms/LoopIdiom/LoongArch/lit.local.cfg b/llvm/test/Transforms/LoopIdiom/LoongArch/lit.local.cfg new file mode 100644 index 00..cc24278acbb414 --- /dev/null +++ b/llvm/test/Transforms/LoopIdiom/LoongArch/lit.local.cfg @@ -0,0 +1,2 @@ +if not "LoongArch" in config.root.targets: +config.unsupported = True diff --git a/llvm/test/Transforms/LoopIdiom/LoongArch/popcnt.ll b/llvm/test/Transforms/LoopIdiom/LoongArch/popcnt.ll new file mode 100644 index 00..7d0fd2ebee3e8d --- /dev/null +++ b/llvm/test/Transforms/LoopIdiom/LoongArch/popcnt.ll @@ -0,0 +1,320 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py +; RUN: opt -passes=loop-idiom -mtriple=loongarch32 -mattr=+lsx -S < %s | FileCheck %s --check-prefixes=CPOP +; RUN: opt -passes=loop-idiom -mtriple=loongarch64 -mattr=+lsx -S < %s | FileCheck %s --check-prefixes=CPOP +; RUN: opt -passes=loop-idiom -mtriple=loongarch32 -S < %s | FileCheck %s --check-prefixes=NOCPOP +; RUN: opt -passes=loop-idiom -mtriple=loongarch64 -S < %s | FileCheck %s --check-prefixes=NOCPOP + +; Mostly copied from RISCV version. 
+ +;To recognize this pattern: +;int popcount(unsigned long long a) { +;int c = 0; +;while (a) { +;c++; +;a &= a - 1; +;} +;return c; +;} +; + +define i32 @popcount_i64(i64 %a) nounwind uwtable readnone ssp { +; CPOP-LABEL: @popcount_i64( +; CPOP-NEXT: entry: +; CPOP-NEXT:[[TMP0:%.*]] = call i64 @llvm.ctpop.i64(i64 [[A:%.*]]) +; CPOP-NEXT:[[TMP1:%.*]] = trunc i64 [[TMP0]] to i32 +; CPOP-NEXT:[[TMP2:%.*]] = icmp eq i32 [[TMP1]], 0 +; CPOP-NEXT:br i1 [[TMP2]], label [[WHILE_END:%.*]], label [[WHILE_BODY_PREHEADER:%.*]] +; CPOP: while.body.preheader: +; CPOP-NEXT:br label [[WHILE_BODY:%.*]] +; CPOP: while.body: +; CPOP-NEXT:[[TCPHI:%.*]] = phi i32 [ [[TMP1]], [[WHILE_BODY_PREHEADER]] ], [ [[TCDEC:%.*]], [[WHILE_BODY]] ] +; CPOP-NEXT:[[C_05:%.*]] = phi i32 [ [[INC:%.*]], [[WHILE_BODY]] ], [ 0, [[WHILE_BODY_PREHEADER]] ] +; CPOP-NEXT:[[A_ADDR_04:%.*]] = phi i64 [ [[AND:%.*]], [[WHILE_BODY]] ], [ [[A]], [[WHILE_BODY_PREHEADER]] ] +; CPOP-NEXT:[[INC]] = add nsw i32 [[C_05]], 1 +; CPOP-NEXT:[[SUB:%.*]] = add i64 [[A_ADDR_04]], -1 +; CPOP-NEXT:[[AND]] = and i64 [[SUB]], [[A_ADDR_04]] +; CPOP-NEXT:[[TCDEC]] = sub nsw i32 [[TCPHI]], 1 +; CPOP-NEXT:[[TOBOOL:%.*]] = icmp sle i32 [[TCDEC]], 0 +; CPOP-NEXT:br i1 [[TOBOOL]], label [[WHILE_END_LOOPEXIT:%.*]], label [[WHILE_BODY]] +; CPOP: while.end.loopexit: +; CPOP-NEXT:[[INC_LCSSA:%.*]] = phi i32 [ [[TMP1]], [[WHILE_BODY]] ] +; CPOP-NEXT:br label [[WHILE_END]] +; CPOP: while.end: +; CPOP-NEXT:[[C_0_LCSSA:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[INC_LCSSA]], [[WHILE_END_LOOPEXIT]] ] +; CPOP-NEXT:ret i32 [[C_0_LCSSA]] +; +; NOCPOP-LABEL: @popcount_i64( +; NOCPOP-NEXT: entry: +; NOCPOP-NEXT:[[TOBOOL3:%.*]] = icmp eq i64 [[A:%.*]],
[llvm-branch-commits] [llvm] [LoongArch] Add TTI support for cpop with LSX (PR #106961)
https://github.com/wangleiat updated https://github.com/llvm/llvm-project/pull/106961

>From 456935df7a65147dce6fbb8da8e60094ed647161 Mon Sep 17 00:00:00 2001
From: wanglei
Date: Mon, 2 Sep 2024 17:59:38 +0800
Subject: [PATCH] remove debug msg

Created using spr 1.3.5-bogner
---
 llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp | 1 -
 1 file changed, 1 deletion(-)

diff --git a/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp b/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp
index 3b227fd7e4345c..5fbc7c734168d1 100644
--- a/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp
+++ b/llvm/lib/Target/LoongArch/LoongArchTargetTransformInfo.cpp
@@ -86,7 +86,6 @@ const char *LoongArchTTIImpl::getRegisterClassName(unsigned ClassID) const {
 TargetTransformInfo::PopcntSupportKind
 LoongArchTTIImpl::getPopcntSupport(unsigned TyWidth) {
   assert(isPowerOf2_32(TyWidth) && "Ty width must be power of 2");
-  llvm::errs() << "XXX: " << TyWidth << "\n";
   return ST->hasExtLSX() ? TTI::PSK_FastHardware : TTI::PSK_Software;
 }
 
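For readers unfamiliar with the idiom, the loop quoted in the popcnt.ll test comment counts set bits by clearing the lowest set bit on every iteration. The following is a minimal standalone sketch, not part of the patch: the names popcountLoop and popcountRef are made up for illustration, and std::bitset is used only as an independent reference count.

```cpp
#include <bitset>
#include <cstdint>

// Loop form quoted in the test comment: every iteration clears the lowest
// set bit of `a`, so the loop runs once per set bit.
int popcountLoop(std::uint64_t a) {
  int c = 0;
  while (a) {
    ++c;
    a &= a - 1;
  }
  return c;
}

// Independent reference: count set bits through std::bitset.
int popcountRef(std::uint64_t a) {
  return static_cast<int>(std::bitset<64>(a).count());
}
```

Both functions should agree for any 64-bit input, for example popcountLoop(0xF0) and popcountRef(0xF0) both give 4. That equivalence is what allows a pass to replace the loop's trip count with a population-count intrinsic when the target reports fast hardware support.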
[llvm-branch-commits] [llvm] [X86] Avoid generating nested CALLSEQ for TLS pointer function arguments (PR #106965)
https://github.com/ritter-x2a created https://github.com/llvm/llvm-project/pull/106965

When a pointer to thread-local storage is passed in a function call, ISel first lowers the call and wraps the resulting code in CALLSEQ markers. Afterwards, to compute the pointer to TLS, a call to retrieve the TLS base address is generated and then wrapped in a set of CALLSEQ markers. If the latter call is inserted into the call sequence of the former call, this leads to nested call frames, which are illegal and lead to errors in the machine verifier.

This patch avoids surrounding the call to compute the TLS base address in CALLSEQ markers if it is already surrounded by such markers. It relies on zero-sized call frames being represented in the call frame size info stored in the MachineBBs.

Fixes #45574 and #98042.

>From 7159933bbf635490b2c4b9daea99d33373b6c2de Mon Sep 17 00:00:00 2001
From: Fabian Ritter
Date: Mon, 2 Sep 2024 05:37:33 -0400
Subject: [PATCH] [X86] Avoid generating nested CALLSEQ for TLS pointer function arguments

When a pointer to thread-local storage is passed in a function call, ISel first lowers the call and wraps the resulting code in CALLSEQ markers. Afterwards, to compute the pointer to TLS, a call to retrieve the TLS base address is generated and then wrapped in a set of CALLSEQ markers. If the latter call is inserted into the call sequence of the former call, this leads to nested call frames, which are illegal and lead to errors in the machine verifier.

This patch avoids surrounding the call to compute the TLS base address in CALLSEQ markers if it is already surrounded by such markers. It relies on zero-sized call frames being represented in the call frame size info stored in the MachineBBs.

Fixes #45574 and #98042.
---
 llvm/lib/Target/X86/X86ISelLowering.cpp        | 7 +++
 llvm/test/CodeGen/X86/tls-function-argument.ll | 17 +
 2 files changed, 24 insertions(+)
 create mode 100644 llvm/test/CodeGen/X86/tls-function-argument.ll

diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index bbee0af109c74b..bf9777888df831 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -35593,6 +35593,13 @@ X86TargetLowering::EmitLoweredTLSAddr(MachineInstr &MI,
   // inside MC, therefore without the two markers shrink-wrapping
   // may push the prologue/epilogue pass them.
   const TargetInstrInfo &TII = *Subtarget.getInstrInfo();
+
+  // Do not introduce CALLSEQ markers if we are already in a call sequence.
+  // Nested call sequences are not allowed and cause errors in the machine
+  // verifier.
+  if (TII.getCallFrameSizeAt(MI).has_value())
+    return BB;
+
   const MIMetadata MIMD(MI);
   MachineFunction &MF = *BB->getParent();
 
diff --git a/llvm/test/CodeGen/X86/tls-function-argument.ll b/llvm/test/CodeGen/X86/tls-function-argument.ll
new file mode 100644
index 00..ec2d664fc6b96f
--- /dev/null
+++ b/llvm/test/CodeGen/X86/tls-function-argument.ll
@@ -0,0 +1,17 @@
+; RUN: llc -verify-machineinstrs < %s -relocation-model=pic
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+; Passing a pointer to thread-local storage to a function can be problematic
+; since computing such addresses requires a function call that is introduced
+; very late in instruction selection. We need to ensure that we don't introduce
+; nested call sequence markers if this function call happens in a call sequence.
+
+@TLS = internal thread_local global i64 zeroinitializer, align 8
+declare void @bar(ptr)
+define internal void @foo() {
+    call void @bar(ptr @TLS)
+    call void @bar(ptr @TLS)
+    ret void
+}
\ No newline at end of file
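The invariant described in this patch (call-sequence markers must pair up without nesting) can be illustrated with a toy model. This is not LLVM code: the Marker enum and the functions hasNestedCallSeq, emitTLSBaseCall and buildOuterCall are hypothetical names invented for this sketch, and the check only loosely mimics what the machine verifier does on real machine instructions.

```cpp
#include <vector>

// Toy stand-ins for call-frame setup/destroy pseudo-instructions.
enum class Marker { CallSeqStart, CallSeqEnd, Other };

// Returns true if a CallSeqStart is seen while a call sequence is still open.
bool hasNestedCallSeq(const std::vector<Marker> &Seq) {
  bool InCallSeq = false;
  for (Marker M : Seq) {
    if (M == Marker::CallSeqStart) {
      if (InCallSeq)
        return true; // second start before the matching end: nested
      InCallSeq = true;
    } else if (M == Marker::CallSeqEnd) {
      InCallSeq = false;
    }
  }
  return false;
}

// Models the decision made in the patch: wrap the TLS base-address call in
// markers only when it is not already inside a call sequence.
std::vector<Marker> emitTLSBaseCall(bool AlreadyInCallSeq) {
  if (AlreadyInCallSeq)
    return {Marker::Other};
  return {Marker::CallSeqStart, Marker::Other, Marker::CallSeqEnd};
}

// Builds an outer call whose argument needs a TLS base-address call.
// Patched == false models the old behaviour that ignored the open sequence.
std::vector<Marker> buildOuterCall(bool Patched) {
  std::vector<Marker> Seq = {Marker::CallSeqStart};
  std::vector<Marker> Inner = emitTLSBaseCall(/*AlreadyInCallSeq=*/Patched);
  Seq.insert(Seq.end(), Inner.begin(), Inner.end());
  Seq.push_back(Marker::Other); // the outer call itself
  Seq.push_back(Marker::CallSeqEnd);
  return Seq;
}
```

In this model hasNestedCallSeq(buildOuterCall(false)) reports nesting, while hasNestedCallSeq(buildOuterCall(true)) does not, which mirrors the early return added to EmitLoweredTLSAddr.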
[llvm-branch-commits] [llvm] [X86] Avoid generating nested CALLSEQ for TLS pointer function arguments (PR #106965)
ritter-x2a wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

* **#106965** 👈
* **#106964**
* `main`

This stack of pull requests is managed by Graphite.

https://github.com/llvm/llvm-project/pull/106965
[llvm-branch-commits] [llvm] [X86] Avoid generating nested CALLSEQ for TLS pointer function arguments (PR #106965)
https://github.com/ritter-x2a ready_for_review https://github.com/llvm/llvm-project/pull/106965
[llvm-branch-commits] [llvm] [X86] Avoid generating nested CALLSEQ for TLS pointer function arguments (PR #106965)
llvmbot wrote: @llvm/pr-subscribers-backend-x86 Author: Fabian Ritter (ritter-x2a) Changes When a pointer to thread-local storage is passed in a function call, ISel first lowers the call and wraps the resulting code in CALLSEQ markers. Afterwards, to compute the pointer to TLS, a call to retrieve the TLS base address is generated and then wrapped in a set of CALLSEQ markers. If the latter call is inserted into the call sequence of the former call, this leads to nested call frames, which are illegal and lead to errors in the machine verifier. This patch avoids surrounding the call to compute the TLS base address in CALLSEQ markers if it is already surrounded by such markers. It relies on zero-sized call frames being represented in the call frame size info stored in the MachineBBs. Fixes #45574 and #98042. --- Full diff: https://github.com/llvm/llvm-project/pull/106965.diff 2 Files Affected: - (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+7) - (added) llvm/test/CodeGen/X86/tls-function-argument.ll (+17) ``diff diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp index bbee0af109c74b..bf9777888df831 100644 --- a/llvm/lib/Target/X86/X86ISelLowering.cpp +++ b/llvm/lib/Target/X86/X86ISelLowering.cpp @@ -35593,6 +35593,13 @@ X86TargetLowering::EmitLoweredTLSAddr(MachineInstr &MI, // inside MC, therefore without the two markers shrink-wrapping // may push the prologue/epilogue pass them. const TargetInstrInfo &TII = *Subtarget.getInstrInfo(); + + // Do not introduce CALLSEQ markers if we are already in a call sequence. + // Nested call sequences are not allowed and cause errors in the machine + // verifier. 
+ if (TII.getCallFrameSizeAt(MI).has_value()) +return BB; + const MIMetadata MIMD(MI); MachineFunction &MF = *BB->getParent(); diff --git a/llvm/test/CodeGen/X86/tls-function-argument.ll b/llvm/test/CodeGen/X86/tls-function-argument.ll new file mode 100644 index 00..ec2d664fc6b96f --- /dev/null +++ b/llvm/test/CodeGen/X86/tls-function-argument.ll @@ -0,0 +1,17 @@ +; RUN: llc -verify-machineinstrs < %s -relocation-model=pic + +target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" +target triple = "x86_64-unknown-linux-gnu" + +; Passing a pointer to thread-local storage to a function can be problematic +; since computing such addresses requires a function call that is introduced +; very late in instruction selection. We need to ensure that we don't introduce +; nested call sequence markers if this function call happens in a call sequence. + +@TLS = internal thread_local global i64 zeroinitializer, align 8 +declare void @bar(ptr) +define internal void @foo() { +call void @bar(ptr @TLS) +call void @bar(ptr @TLS) +ret void +} \ No newline at end of file `` https://github.com/llvm/llvm-project/pull/106965 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [AVR] Fix parsing & emitting relative jumps (#106722) (PR #106729)
Patryk27 wrote: Can I somehow help here? Usually I'd cherry-pick changes from https://github.com/llvm/llvm-project/pull/106739 into here myself, but since I'm not the author of the pull request, I can't modify it 👀 (and we'd probably like to avoid having two separate backport pull requests to avoid breaking the branch in the meantime) https://github.com/llvm/llvm-project/pull/106729
[llvm-branch-commits] [llvm] release/19.x: Restrict LLVM_TARGETS_TO_BUILD in Windows release packaging (#106059) (PR #106546)
zmodem wrote:

> @zmodem (or anyone else). If you would like to add a note about this fix in
> the release notes (completely optional). Please reply to this comment with a
> one or two sentence description of the fix. When you are done, please add the
> release:note label to this PR.

How about: "Starting with LLVM 19, the Windows installers only include support for the X86, ARM, and AArch64 targets in order to keep the build size within the limits of the NSIS installer framework."

https://github.com/llvm/llvm-project/pull/106546
[llvm-branch-commits] [llvm] [AMDGPU] Fix sign confusion in performMulLoHiCombine (PR #106977)
https://github.com/jayfoad milestoned https://github.com/llvm/llvm-project/pull/106977
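As background for the "sign confusion" named in this PR's title: a widening 32x32 to 64-bit multiply has the same low half whether the operands are treated as signed or unsigned, but the high half differs whenever an operand has its top bit set. The sketch below uses plain C++ integers rather than SelectionDAG nodes, and the helper names are made up for illustration.

```cpp
#include <cstdint>

// High 32 bits of the unsigned 32x32->64 product.
std::uint32_t mulHiU32(std::uint32_t a, std::uint32_t b) {
  return static_cast<std::uint32_t>((static_cast<std::uint64_t>(a) * b) >> 32);
}

// High 32 bits of the signed 32x32->64 product, returned as a bit pattern
// so the two variants can be compared directly.
std::uint32_t mulHiS32(std::int32_t a, std::int32_t b) {
  std::int64_t p = static_cast<std::int64_t>(a) * b;
  return static_cast<std::uint32_t>(static_cast<std::uint64_t>(p) >> 32);
}

// Low 32 bits of the unsigned product.
std::uint32_t mulLoU32(std::uint32_t a, std::uint32_t b) {
  return static_cast<std::uint32_t>(static_cast<std::uint64_t>(a) * b);
}

// Low 32 bits of the signed product, as a bit pattern.
std::uint32_t mulLoS32(std::int32_t a, std::int32_t b) {
  std::int64_t p = static_cast<std::int64_t>(a) * b;
  return static_cast<std::uint32_t>(static_cast<std::uint64_t>(p));
}
```

With the bit pattern 0xFFFFFFFF (that is, -1 when read as signed) multiplied by 2, both low halves are 0xFFFFFFFE, but mulHiU32 gives 1 while mulHiS32 gives 0xFFFFFFFF. A combine that picks the multiply-high opcode of the wrong signedness would therefore produce a wrong high result.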
[llvm-branch-commits] [llvm] [AMDGPU] Fix sign confusion in performMulLoHiCombine (PR #106977)
https://github.com/jayfoad created https://github.com/llvm/llvm-project/pull/106977

SMUL_LOHI and UMUL_LOHI are different operations because the high part of the result is different, so it is not OK to optimize the signed version to MUL_U24/MULHI_U24 or the unsigned version to MUL_I24/MULHI_I24.

>From 04226baceb4e2823a7ca3daac236f705b3c6c33e Mon Sep 17 00:00:00 2001
From: Jay Foad
Date: Tue, 27 Aug 2024 17:09:40 +0100
Subject: [PATCH] [AMDGPU] Fix sign confusion in performMulLoHiCombine (#105831)

SMUL_LOHI and UMUL_LOHI are different operations because the high part of the result is different, so it is not OK to optimize the signed version to MUL_U24/MULHI_U24 or the unsigned version to MUL_I24/MULHI_I24.
---
 llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp | 30 +++---
 llvm/test/CodeGen/AMDGPU/mul_int24.ll | 98 +++
 2 files changed, 116 insertions(+), 12 deletions(-)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
index 39ae7c96cf7729..a71c9453d968dd 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
@@ -4349,6 +4349,7 @@ AMDGPUTargetLowering::performMulLoHiCombine(SDNode *N,
   SelectionDAG &DAG = DCI.DAG;
   SDLoc DL(N);

+  bool Signed = N->getOpcode() == ISD::SMUL_LOHI;
   SDValue N0 = N->getOperand(0);
   SDValue N1 = N->getOperand(1);

@@ -4363,20 +4364,25 @@ AMDGPUTargetLowering::performMulLoHiCombine(SDNode *N,

   // Try to use two fast 24-bit multiplies (one for each half of the result)
   // instead of one slow extending multiply.
-  unsigned LoOpcode, HiOpcode;
-  if (Subtarget->hasMulU24() && isU24(N0, DAG) && isU24(N1, DAG)) {
-    N0 = DAG.getZExtOrTrunc(N0, DL, MVT::i32);
-    N1 = DAG.getZExtOrTrunc(N1, DL, MVT::i32);
-    LoOpcode = AMDGPUISD::MUL_U24;
-    HiOpcode = AMDGPUISD::MULHI_U24;
-  } else if (Subtarget->hasMulI24() && isI24(N0, DAG) && isI24(N1, DAG)) {
-    N0 = DAG.getSExtOrTrunc(N0, DL, MVT::i32);
-    N1 = DAG.getSExtOrTrunc(N1, DL, MVT::i32);
-    LoOpcode = AMDGPUISD::MUL_I24;
-    HiOpcode = AMDGPUISD::MULHI_I24;
+  unsigned LoOpcode = 0;
+  unsigned HiOpcode = 0;
+  if (Signed) {
+    if (Subtarget->hasMulI24() && isI24(N0, DAG) && isI24(N1, DAG)) {
+      N0 = DAG.getSExtOrTrunc(N0, DL, MVT::i32);
+      N1 = DAG.getSExtOrTrunc(N1, DL, MVT::i32);
+      LoOpcode = AMDGPUISD::MUL_I24;
+      HiOpcode = AMDGPUISD::MULHI_I24;
+    }
   } else {
-    return SDValue();
+    if (Subtarget->hasMulU24() && isU24(N0, DAG) && isU24(N1, DAG)) {
+      N0 = DAG.getZExtOrTrunc(N0, DL, MVT::i32);
+      N1 = DAG.getZExtOrTrunc(N1, DL, MVT::i32);
+      LoOpcode = AMDGPUISD::MUL_U24;
+      HiOpcode = AMDGPUISD::MULHI_U24;
+    }
   }
+  if (!LoOpcode)
+    return SDValue();

   SDValue Lo = DAG.getNode(LoOpcode, DL, MVT::i32, N0, N1);
   SDValue Hi = DAG.getNode(HiOpcode, DL, MVT::i32, N0, N1);
diff --git a/llvm/test/CodeGen/AMDGPU/mul_int24.ll b/llvm/test/CodeGen/AMDGPU/mul_int24.ll
index be77a10380c49b..8f4c48fae6fb31 100644
--- a/llvm/test/CodeGen/AMDGPU/mul_int24.ll
+++ b/llvm/test/CodeGen/AMDGPU/mul_int24.ll
@@ -813,4 +813,102 @@ bb7:
   ret void
 }
+
+define amdgpu_kernel void @test_umul_i24(ptr addrspace(1) %out, i32 %arg) {
+; SI-LABEL: test_umul_i24:
+; SI: ; %bb.0:
+; SI-NEXT:    s_load_dword s1, s[2:3], 0xb
+; SI-NEXT:    v_mov_b32_e32 v0, 0xff803fe1
+; SI-NEXT:    s_mov_b32 s0, 0
+; SI-NEXT:    s_mov_b32 s3, 0xf000
+; SI-NEXT:    s_waitcnt lgkmcnt(0)
+; SI-NEXT:    s_lshr_b32 s1, s1, 9
+; SI-NEXT:    v_mul_hi_u32 v0, s1, v0
+; SI-NEXT:    s_mul_i32 s1, s1, 0xff803fe1
+; SI-NEXT:    v_alignbit_b32 v0, v0, s1, 1
+; SI-NEXT:    s_mov_b32 s2, -1
+; SI-NEXT:    s_mov_b32 s1, s0
+; SI-NEXT:    buffer_store_dword v0, off, s[0:3], 0
+; SI-NEXT:    s_endpgm
+;
+; VI-LABEL: test_umul_i24:
+; VI: ; %bb.0:
+; VI-NEXT:    s_load_dword s0, s[2:3], 0x2c
+; VI-NEXT:    v_mov_b32_e32 v0, 0xff803fe1
+; VI-NEXT:    s_mov_b32 s3, 0xf000
+; VI-NEXT:    s_mov_b32 s2, -1
+; VI-NEXT:    s_waitcnt lgkmcnt(0)
+; VI-NEXT:    s_lshr_b32 s0, s0, 9
+; VI-NEXT:    v_mad_u64_u32 v[0:1], s[0:1], s0, v0, 0
+; VI-NEXT:    s_mov_b32 s0, 0
+; VI-NEXT:    s_mov_b32 s1, s0
+; VI-NEXT:    v_alignbit_b32 v0, v1, v0, 1
+; VI-NEXT:    s_nop 1
+; VI-NEXT:    buffer_store_dword v0, off, s[0:3], 0
+; VI-NEXT:    s_endpgm
+;
+; GFX9-LABEL: test_umul_i24:
+; GFX9: ; %bb.0:
+; GFX9-NEXT:    s_load_dword s1, s[2:3], 0x2c
+; GFX9-NEXT:    s_mov_b32 s0, 0
+; GFX9-NEXT:    s_mov_b32 s3, 0xf000
+; GFX9-NEXT:    s_mov_b32 s2, -1
+; GFX9-NEXT:    s_waitcnt lgkmcnt(0)
+; GFX9-NEXT:    s_lshr_b32 s1, s1, 9
+; GFX9-NEXT:    s_mul_hi_u32 s4, s1, 0xff803fe1
+; GFX9-NEXT:    s_mul_i32 s1, s1, 0xff803fe1
+; GFX9-NEXT:    v_mov_b32_e32 v0, s1
+; GFX9-NEXT:    v_alignbit_b32 v0, s4, v0, 1
+; GFX9-NEXT:    s_mov_b32 s1, s0
+; GFX9-NEXT:    buffer_store_dword v0, off, s[0:3], 0
+; GFX9-NEXT:    s_endpgm
+;
+; EG-LABEL: test_umul_i24:
+; EG: ; %bb.0:
+; EG-
[llvm-branch-commits] [llvm] [AMDGPU] Fix sign confusion in performMulLoHiCombine (PR #106977)
jayfoad wrote: This is a backport of #105831. https://github.com/llvm/llvm-project/pull/106977
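As an aside on why the signed and unsigned forms cannot be swapped, here is a minimal standalone sketch. It is not taken from the patch: `mulLo`, `mulHiUnsigned` and `mulHiSigned` are hypothetical helper names, and plain 32-bit arithmetic stands in for the DAG nodes. The only value borrowed from the PR is the constant `0xff803fe1` used in the new `test_umul_i24` test. The low half of the product is the same either way; only the high half depends on how the operands are extended.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative helpers only; these names do not appear in the LLVM sources.

// Low 32 bits of the product: identical for signed and unsigned operands.
constexpr uint32_t mulLo(uint32_t A, uint32_t B) { return A * B; }

// High 32 bits when both operands are zero-extended (UMUL_LOHI-style).
constexpr uint32_t mulHiUnsigned(uint32_t A, uint32_t B) {
  return static_cast<uint32_t>((static_cast<uint64_t>(A) * B) >> 32);
}

// High 32 bits when both operands are sign-extended (SMUL_LOHI-style).
constexpr uint32_t mulHiSigned(uint32_t A, uint32_t B) {
  const int64_t Wide = static_cast<int64_t>(static_cast<int32_t>(A)) *
                       static_cast<int64_t>(static_cast<int32_t>(B));
  return static_cast<uint32_t>(static_cast<uint64_t>(Wide) >> 32);
}

// 0xff803fe1 read as signed is -8372255, which fits in 24 bits; read as
// unsigned it does not. The two interpretations give different high halves.
static_assert(mulHiUnsigned(5, 0xff803fe1u) == 4u, "unsigned high half");
static_assert(mulHiSigned(5, 0xff803fe1u) == 0xffffffffu, "signed high half");
```

Under these assumptions, an unsigned multiply whose operands happen to pass a signed 24-bit check would get the wrong high half if it were lowered with the signed opcode pair, which appears to be the situation the `Signed` flag in the patch guards against.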
[llvm-branch-commits] [llvm] [AMDGPU] Fix sign confusion in performMulLoHiCombine (PR #106977)
https://github.com/jayfoad edited https://github.com/llvm/llvm-project/pull/106977
[llvm-branch-commits] [llvm] [AMDGPU] Fix sign confusion in performMulLoHiCombine (PR #106977)
llvmbot wrote: @llvm/pr-subscribers-backend-amdgpu Author: Jay Foad (jayfoad) Changes SMUL_LOHI and UMUL_LOHI are different operations because the high part of the result is different, so it is not OK to optimize the signed version to MUL_U24/MULHI_U24 or the unsigned version to MUL_I24/MULHI_I24. --- Full diff: https://github.com/llvm/llvm-project/pull/106977.diff 2 Files Affected: - (modified) llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp (+18-12) - (modified) llvm/test/CodeGen/AMDGPU/mul_int24.ll (+98) ``diff diff --git a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp index 39ae7c96cf7729..a71c9453d968dd 100644 --- a/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp +++ b/llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp @@ -4349,6 +4349,7 @@ AMDGPUTargetLowering::performMulLoHiCombine(SDNode *N, SelectionDAG &DAG = DCI.DAG; SDLoc DL(N); + bool Signed = N->getOpcode() == ISD::SMUL_LOHI; SDValue N0 = N->getOperand(0); SDValue N1 = N->getOperand(1); @@ -4363,20 +4364,25 @@ AMDGPUTargetLowering::performMulLoHiCombine(SDNode *N, // Try to use two fast 24-bit multiplies (one for each half of the result) // instead of one slow extending multiply. 
- unsigned LoOpcode, HiOpcode; - if (Subtarget->hasMulU24() && isU24(N0, DAG) && isU24(N1, DAG)) { -N0 = DAG.getZExtOrTrunc(N0, DL, MVT::i32); -N1 = DAG.getZExtOrTrunc(N1, DL, MVT::i32); -LoOpcode = AMDGPUISD::MUL_U24; -HiOpcode = AMDGPUISD::MULHI_U24; - } else if (Subtarget->hasMulI24() && isI24(N0, DAG) && isI24(N1, DAG)) { -N0 = DAG.getSExtOrTrunc(N0, DL, MVT::i32); -N1 = DAG.getSExtOrTrunc(N1, DL, MVT::i32); -LoOpcode = AMDGPUISD::MUL_I24; -HiOpcode = AMDGPUISD::MULHI_I24; + unsigned LoOpcode = 0; + unsigned HiOpcode = 0; + if (Signed) { +if (Subtarget->hasMulI24() && isI24(N0, DAG) && isI24(N1, DAG)) { + N0 = DAG.getSExtOrTrunc(N0, DL, MVT::i32); + N1 = DAG.getSExtOrTrunc(N1, DL, MVT::i32); + LoOpcode = AMDGPUISD::MUL_I24; + HiOpcode = AMDGPUISD::MULHI_I24; +} } else { -return SDValue(); +if (Subtarget->hasMulU24() && isU24(N0, DAG) && isU24(N1, DAG)) { + N0 = DAG.getZExtOrTrunc(N0, DL, MVT::i32); + N1 = DAG.getZExtOrTrunc(N1, DL, MVT::i32); + LoOpcode = AMDGPUISD::MUL_U24; + HiOpcode = AMDGPUISD::MULHI_U24; +} } + if (!LoOpcode) +return SDValue(); SDValue Lo = DAG.getNode(LoOpcode, DL, MVT::i32, N0, N1); SDValue Hi = DAG.getNode(HiOpcode, DL, MVT::i32, N0, N1); diff --git a/llvm/test/CodeGen/AMDGPU/mul_int24.ll b/llvm/test/CodeGen/AMDGPU/mul_int24.ll index be77a10380c49b..8f4c48fae6fb31 100644 --- a/llvm/test/CodeGen/AMDGPU/mul_int24.ll +++ b/llvm/test/CodeGen/AMDGPU/mul_int24.ll @@ -813,4 +813,102 @@ bb7: ret void } + +define amdgpu_kernel void @test_umul_i24(ptr addrspace(1) %out, i32 %arg) { +; SI-LABEL: test_umul_i24: +; SI: ; %bb.0: +; SI-NEXT:s_load_dword s1, s[2:3], 0xb +; SI-NEXT:v_mov_b32_e32 v0, 0xff803fe1 +; SI-NEXT:s_mov_b32 s0, 0 +; SI-NEXT:s_mov_b32 s3, 0xf000 +; SI-NEXT:s_waitcnt lgkmcnt(0) +; SI-NEXT:s_lshr_b32 s1, s1, 9 +; SI-NEXT:v_mul_hi_u32 v0, s1, v0 +; SI-NEXT:s_mul_i32 s1, s1, 0xff803fe1 +; SI-NEXT:v_alignbit_b32 v0, v0, s1, 1 +; SI-NEXT:s_mov_b32 s2, -1 +; SI-NEXT:s_mov_b32 s1, s0 +; SI-NEXT:buffer_store_dword v0, off, s[0:3], 0 +; 
SI-NEXT:s_endpgm +; +; VI-LABEL: test_umul_i24: +; VI: ; %bb.0: +; VI-NEXT:s_load_dword s0, s[2:3], 0x2c +; VI-NEXT:v_mov_b32_e32 v0, 0xff803fe1 +; VI-NEXT:s_mov_b32 s3, 0xf000 +; VI-NEXT:s_mov_b32 s2, -1 +; VI-NEXT:s_waitcnt lgkmcnt(0) +; VI-NEXT:s_lshr_b32 s0, s0, 9 +; VI-NEXT:v_mad_u64_u32 v[0:1], s[0:1], s0, v0, 0 +; VI-NEXT:s_mov_b32 s0, 0 +; VI-NEXT:s_mov_b32 s1, s0 +; VI-NEXT:v_alignbit_b32 v0, v1, v0, 1 +; VI-NEXT:s_nop 1 +; VI-NEXT:buffer_store_dword v0, off, s[0:3], 0 +; VI-NEXT:s_endpgm +; +; GFX9-LABEL: test_umul_i24: +; GFX9: ; %bb.0: +; GFX9-NEXT:s_load_dword s1, s[2:3], 0x2c +; GFX9-NEXT:s_mov_b32 s0, 0 +; GFX9-NEXT:s_mov_b32 s3, 0xf000 +; GFX9-NEXT:s_mov_b32 s2, -1 +; GFX9-NEXT:s_waitcnt lgkmcnt(0) +; GFX9-NEXT:s_lshr_b32 s1, s1, 9 +; GFX9-NEXT:s_mul_hi_u32 s4, s1, 0xff803fe1 +; GFX9-NEXT:s_mul_i32 s1, s1, 0xff803fe1 +; GFX9-NEXT:v_mov_b32_e32 v0, s1 +; GFX9-NEXT:v_alignbit_b32 v0, s4, v0, 1 +; GFX9-NEXT:s_mov_b32 s1, s0 +; GFX9-NEXT:buffer_store_dword v0, off, s[0:3], 0 +; GFX9-NEXT:s_endpgm +; +; EG-LABEL: test_umul_i24: +; EG: ; %bb.0: +; EG-NEXT:ALU 8, @4, KC0[CB0:0-32], KC1[] +; EG-NEXT:MEM_RAT_CACHELESS STORE_RAW T0.X, T1.X, 1 +; EG-NEXT:CF_END +; EG-NEXT:PAD +; EG-NEXT:ALU clause starting at 4: +; EG-NEXT: LSHR * T0.W, KC0[2].Z, literal.x, +; EG-NEXT:9(1.261169e-44), 0(0.00e+00) +; EG-NEXT: MULHI * T0.X, PV.W, literal.x, +; EG-NEXT:-8372255(nan), 0(0.00e+0
[compiler-rt] release/19.x: [compiler-rt][fuzzer] SetThreadName build fix for Mingw… (PR #106908)
https://github.com/mstorsjo requested changes to this pull request. Do not backport this. This breaks builds in mingw environments that don't use winpthreads! https://github.com/llvm/llvm-project/pull/106908
[llvm-branch-commits] [compiler-rt] release/19.x: [compiler-rt][fuzzer] SetThreadName build fix for Mingw… (PR #106908)
https://github.com/devnexen closed https://github.com/llvm/llvm-project/pull/106908
[llvm-branch-commits] [llvm] release/19.x: Win release packaging: Don't try to use rpmalloc for 32-bit x86 (#106969) (PR #106985)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/106985

Backport ef26afcb88dcb5f2de79bfc3cf88a8ea10f230ec

Requested by: @zmodem

>From ca30c4312e13edb54fa2f4288f7e430854671d19 Mon Sep 17 00:00:00 2001
From: Hans
Date: Mon, 2 Sep 2024 15:04:13 +0200
Subject: [PATCH] Win release packaging: Don't try to use rpmalloc for 32-bit x86 (#106969)

because that doesn't work (results in `LINK : error LNK2001: unresolved external symbol malloc`).

Based on the title of #91862 it was only intended for use in 64-bit builds.

(cherry picked from commit ef26afcb88dcb5f2de79bfc3cf88a8ea10f230ec)
---
 llvm/utils/release/build_llvm_release.bat | 1 +
 1 file changed, 1 insertion(+)

diff --git a/llvm/utils/release/build_llvm_release.bat b/llvm/utils/release/build_llvm_release.bat
index 64ae2d41ab2b02..3508748c1d5404 100755
--- a/llvm/utils/release/build_llvm_release.bat
+++ b/llvm/utils/release/build_llvm_release.bat
@@ -193,6 +193,7 @@ REM Stage0 binaries directory; used in stage1.
 set "stage0_bin_dir=%build_dir%/build32_stage0/bin"
 set cmake_flags=^
   %common_cmake_flags% ^
+  -DLLVM_ENABLE_RPMALLOC=OFF ^
   -DLLDB_TEST_COMPILER=%stage0_bin_dir%/clang.exe ^
   -DPYTHON_HOME=%PYTHONHOME% ^
   -DPython3_ROOT_DIR=%PYTHONHOME% ^
[llvm-branch-commits] [llvm] release/19.x: Win release packaging: Don't try to use rpmalloc for 32-bit x86 (#106969) (PR #106985)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/106985
[llvm-branch-commits] [llvm] release/19.x: Win release packaging: Don't try to use rpmalloc for 32-bit x86 (#106969) (PR #106985)
llvmbot wrote: @aganea What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/106985
[llvm-branch-commits] [llvm] release/19.x: Win release packaging: Don't try to use rpmalloc for 32-bit x86 (#106969) (PR #106985)
https://github.com/aganea approved this pull request. https://github.com/llvm/llvm-project/pull/106985
[llvm-branch-commits] [clang-tools-extra] release/19.x: [clangd] Update TidyFastChecks for release/19.x (#106354) (PR #106989)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/106989
[llvm-branch-commits] [clang-tools-extra] release/19.x: [clangd] Update TidyFastChecks for release/19.x (#106354) (PR #106989)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/106989 Backport b47d7ce8121b1cb1923e879d58eaa1d63aeaaae2 Requested by: @kadircet >From 0de791716d55892ea1872abfec078e4e07bccb19 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?kadir=20=C3=A7etinkaya?= Date: Mon, 2 Sep 2024 15:25:26 +0200 Subject: [PATCH] [clangd] Update TidyFastChecks for release/19.x (#106354) Run for clang-tidy checks available in release/19.x branch. Some notable findings: - altera-id-dependent-backward-branch, stays slow with 13%. - misc-const-correctness become faster, going from 261% to 67%, but still above 8% threshold. - misc-header-include-cycle is a new SLOW check with 10% runtime implications - readability-container-size-empty went from 16% to 13%, still SLOW. (cherry picked from commit b47d7ce8121b1cb1923e879d58eaa1d63aeaaae2) --- clang-tools-extra/clangd/TidyFastChecks.inc | 669 +++- 1 file changed, 367 insertions(+), 302 deletions(-) diff --git a/clang-tools-extra/clangd/TidyFastChecks.inc b/clang-tools-extra/clangd/TidyFastChecks.inc index 9050ce16127ff4..de1a025602fa9c 100644 --- a/clang-tools-extra/clangd/TidyFastChecks.inc +++ b/clang-tools-extra/clangd/TidyFastChecks.inc @@ -7,370 +7,435 @@ #define SLOW(CHECK, DELTA) #endif -FAST(abseil-cleanup-ctad, -1.0) +FAST(abseil-cleanup-ctad, -2.0) FAST(abseil-duration-addition, 0.0) -FAST(abseil-duration-comparison, 1.0) -FAST(abseil-duration-conversion-cast, 3.0) -FAST(abseil-duration-division, -0.0) -FAST(abseil-duration-factory-float, 1.0) -FAST(abseil-duration-factory-scale, -0.0) -FAST(abseil-duration-subtraction, 1.0) -FAST(abseil-duration-unnecessary-conversion, 4.0) -FAST(abseil-faster-strsplit-delimiter, 2.0) -FAST(abseil-no-internal-dependencies, -1.0) -FAST(abseil-no-namespace, -1.0) -FAST(abseil-redundant-strcat-calls, 2.0) -FAST(abseil-str-cat-append, 1.0) -FAST(abseil-string-find-startswith, 1.0) -FAST(abseil-string-find-str-contains, 1.0) -FAST(abseil-time-comparison, -0.0) -FAST(abseil-time-subtraction, 
0.0) +FAST(abseil-duration-comparison, -1.0) +FAST(abseil-duration-conversion-cast, -1.0) +FAST(abseil-duration-division, 0.0) +FAST(abseil-duration-factory-float, 2.0) +FAST(abseil-duration-factory-scale, 1.0) +FAST(abseil-duration-subtraction, -1.0) +FAST(abseil-duration-unnecessary-conversion, -0.0) +FAST(abseil-faster-strsplit-delimiter, 3.0) +FAST(abseil-no-internal-dependencies, 1.0) +FAST(abseil-no-namespace, -0.0) +FAST(abseil-redundant-strcat-calls, 1.0) +FAST(abseil-str-cat-append, -0.0) +FAST(abseil-string-find-startswith, -1.0) +FAST(abseil-string-find-str-contains, 4.0) +FAST(abseil-time-comparison, -1.0) +FAST(abseil-time-subtraction, 1.0) FAST(abseil-upgrade-duration-conversions, 2.0) SLOW(altera-id-dependent-backward-branch, 13.0) -FAST(altera-kernel-name-restriction, -1.0) -FAST(altera-single-work-item-barrier, -1.0) -FAST(altera-struct-pack-align, -1.0) +FAST(altera-kernel-name-restriction, 4.0) +FAST(altera-single-work-item-barrier, 1.0) +FAST(altera-struct-pack-align, -0.0) FAST(altera-unroll-loops, 2.0) -FAST(android-cloexec-accept, -1.0) -FAST(android-cloexec-accept4, 3.0) -FAST(android-cloexec-creat, 0.0) -FAST(android-cloexec-dup, 3.0) -FAST(android-cloexec-epoll-create, -2.0) -FAST(android-cloexec-epoll-create1, -1.0) -FAST(android-cloexec-fopen, -0.0) -FAST(android-cloexec-inotify-init, 1.0) -FAST(android-cloexec-inotify-init1, 2.0) -FAST(android-cloexec-memfd-create, 2.0) -FAST(android-cloexec-open, -1.0) -FAST(android-cloexec-pipe, -1.0) +FAST(android-cloexec-accept, 0.0) +FAST(android-cloexec-accept4, 1.0) +FAST(android-cloexec-creat, 1.0) +FAST(android-cloexec-dup, 0.0) +FAST(android-cloexec-epoll-create, 2.0) +FAST(android-cloexec-epoll-create1, 0.0) +FAST(android-cloexec-fopen, -1.0) +FAST(android-cloexec-inotify-init, 2.0) +FAST(android-cloexec-inotify-init1, -0.0) +FAST(android-cloexec-memfd-create, -1.0) +FAST(android-cloexec-open, 1.0) +FAST(android-cloexec-pipe, -0.0) FAST(android-cloexec-pipe2, 0.0) FAST(android-cloexec-socket, 
1.0) -FAST(android-comparison-in-temp-failure-retry, 0.0) -FAST(boost-use-to-string, 1.0) -FAST(bugprone-argument-comment, 2.0) +FAST(android-comparison-in-temp-failure-retry, 1.0) +FAST(boost-use-ranges, 2.0) +FAST(boost-use-to-string, 2.0) +FAST(bugprone-argument-comment, 4.0) FAST(bugprone-assert-side-effect, 1.0) -FAST(bugprone-assignment-in-if-condition, -0.0) -FAST(bugprone-bad-signal-to-kill-thread, -1.0) +FAST(bugprone-assignment-in-if-condition, 2.0) +FAST(bugprone-bad-signal-to-kill-thread, 1.0) FAST(bugprone-bool-pointer-implicit-conversion, 0.0) -FAST(bugprone-branch-clone, -0.0) +FAST(bugprone-branch-clone, 1.0) +FAST(bugprone-casting-through-void, 1.0) +FAST(bugprone-chained-comparison, 1.0) +FAST(bugprone-compare-pointer-to-member-virtual-function, -0.0) FAST(bugprone-copy-constructor-init, 1.0) -FAST(bugprone-dangling-handle, 0.0) -FAST(bugprone-dynamic-static-initializers, 1.0) +FAST(bugprone-crtp-constructor-accessibility, 0
[llvm-branch-commits] [clang-tools-extra] release/19.x: [clangd] Update TidyFastChecks for release/19.x (#106354) (PR #106989)
llvmbot wrote: @HighCommander4 What do you think about merging this PR to the release branch? https://github.com/llvm/llvm-project/pull/106989
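For readers unfamiliar with the `FAST(CHECK, DELTA)` / `SLOW(CHECK, DELTA)` entries in `TidyFastChecks.inc`, the following is a rough sketch of how such a list can be consumed with the X-macro technique. It is an assumption-laden illustration, not clangd's actual code: `CHECK_TABLE`, `CheckCost`, `Costs` and `isFastCheck` are hypothetical names, and the three rows are copied from the diff above instead of being pulled in from the real `.inc` file.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for the contents of the .inc file. Only these three rows are
// taken from the diff; the real file lists several hundred checks.
#define CHECK_TABLE(FAST, SLOW)                                                \
  FAST(abseil-cleanup-ctad, -2.0)                                              \
  SLOW(altera-id-dependent-backward-branch, 13.0)                              \
  FAST(boost-use-ranges, 2.0)

// Hypothetical record type: check name, measured slowdown in percent, and
// whether the entry was listed as FAST.
struct CheckCost {
  const char *Name;
  double DeltaPercent;
  bool IsFast;
};

// Each row expands to one initializer. #CHECK stringifies the hyphenated
// check name, which is not a valid C++ identifier on its own.
#define FAST_ROW(CHECK, DELTA) {#CHECK, DELTA, true},
#define SLOW_ROW(CHECK, DELTA) {#CHECK, DELTA, false},
static const CheckCost Costs[] = {CHECK_TABLE(FAST_ROW, SLOW_ROW)};
#undef FAST_ROW
#undef SLOW_ROW

// Linear lookup; unknown checks are treated as not fast (an assumption made
// for this sketch only).
inline bool isFastCheck(const char *Name) {
  for (const CheckCost &C : Costs)
    if (std::strcmp(C.Name, Name) == 0)
      return C.IsFast;
  return false;
}
```

With this table, `isFastCheck("boost-use-ranges")` is true and `isFastCheck("altera-id-dependent-backward-branch")` is false, consistent with the 13% figure for that check sitting above the 8% threshold mentioned in the commit message.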
[llvm-branch-commits] [clang-tools-extra] release/19.x: [clangd] Update TidyFastChecks for release/19.x (#106354) (PR #106989)
llvmbot wrote: @llvm/pr-subscribers-clangd Author: None (llvmbot) Changes Backport b47d7ce8121b1cb1923e879d58eaa1d63aeaaae2 Requested by: @kadircet --- Patch is 30.06 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/106989.diff 1 Files Affected: - (modified) clang-tools-extra/clangd/TidyFastChecks.inc (+367-302) ``diff diff --git a/clang-tools-extra/clangd/TidyFastChecks.inc b/clang-tools-extra/clangd/TidyFastChecks.inc index 9050ce16127ff4..de1a025602fa9c 100644 --- a/clang-tools-extra/clangd/TidyFastChecks.inc +++ b/clang-tools-extra/clangd/TidyFastChecks.inc @@ -7,370 +7,435 @@ #define SLOW(CHECK, DELTA) #endif -FAST(abseil-cleanup-ctad, -1.0) +FAST(abseil-cleanup-ctad, -2.0) FAST(abseil-duration-addition, 0.0) -FAST(abseil-duration-comparison, 1.0) -FAST(abseil-duration-conversion-cast, 3.0) -FAST(abseil-duration-division, -0.0) -FAST(abseil-duration-factory-float, 1.0) -FAST(abseil-duration-factory-scale, -0.0) -FAST(abseil-duration-subtraction, 1.0) -FAST(abseil-duration-unnecessary-conversion, 4.0) -FAST(abseil-faster-strsplit-delimiter, 2.0) -FAST(abseil-no-internal-dependencies, -1.0) -FAST(abseil-no-namespace, -1.0) -FAST(abseil-redundant-strcat-calls, 2.0) -FAST(abseil-str-cat-append, 1.0) -FAST(abseil-string-find-startswith, 1.0) -FAST(abseil-string-find-str-contains, 1.0) -FAST(abseil-time-comparison, -0.0) -FAST(abseil-time-subtraction, 0.0) +FAST(abseil-duration-comparison, -1.0) +FAST(abseil-duration-conversion-cast, -1.0) +FAST(abseil-duration-division, 0.0) +FAST(abseil-duration-factory-float, 2.0) +FAST(abseil-duration-factory-scale, 1.0) +FAST(abseil-duration-subtraction, -1.0) +FAST(abseil-duration-unnecessary-conversion, -0.0) +FAST(abseil-faster-strsplit-delimiter, 3.0) +FAST(abseil-no-internal-dependencies, 1.0) +FAST(abseil-no-namespace, -0.0) +FAST(abseil-redundant-strcat-calls, 1.0) +FAST(abseil-str-cat-append, -0.0) +FAST(abseil-string-find-startswith, -1.0) +FAST(abseil-string-find-str-contains, 
4.0) +FAST(abseil-time-comparison, -1.0) +FAST(abseil-time-subtraction, 1.0) FAST(abseil-upgrade-duration-conversions, 2.0) SLOW(altera-id-dependent-backward-branch, 13.0) -FAST(altera-kernel-name-restriction, -1.0) -FAST(altera-single-work-item-barrier, -1.0) -FAST(altera-struct-pack-align, -1.0) +FAST(altera-kernel-name-restriction, 4.0) +FAST(altera-single-work-item-barrier, 1.0) +FAST(altera-struct-pack-align, -0.0) FAST(altera-unroll-loops, 2.0) -FAST(android-cloexec-accept, -1.0) -FAST(android-cloexec-accept4, 3.0) -FAST(android-cloexec-creat, 0.0) -FAST(android-cloexec-dup, 3.0) -FAST(android-cloexec-epoll-create, -2.0) -FAST(android-cloexec-epoll-create1, -1.0) -FAST(android-cloexec-fopen, -0.0) -FAST(android-cloexec-inotify-init, 1.0) -FAST(android-cloexec-inotify-init1, 2.0) -FAST(android-cloexec-memfd-create, 2.0) -FAST(android-cloexec-open, -1.0) -FAST(android-cloexec-pipe, -1.0) +FAST(android-cloexec-accept, 0.0) +FAST(android-cloexec-accept4, 1.0) +FAST(android-cloexec-creat, 1.0) +FAST(android-cloexec-dup, 0.0) +FAST(android-cloexec-epoll-create, 2.0) +FAST(android-cloexec-epoll-create1, 0.0) +FAST(android-cloexec-fopen, -1.0) +FAST(android-cloexec-inotify-init, 2.0) +FAST(android-cloexec-inotify-init1, -0.0) +FAST(android-cloexec-memfd-create, -1.0) +FAST(android-cloexec-open, 1.0) +FAST(android-cloexec-pipe, -0.0) FAST(android-cloexec-pipe2, 0.0) FAST(android-cloexec-socket, 1.0) -FAST(android-comparison-in-temp-failure-retry, 0.0) -FAST(boost-use-to-string, 1.0) -FAST(bugprone-argument-comment, 2.0) +FAST(android-comparison-in-temp-failure-retry, 1.0) +FAST(boost-use-ranges, 2.0) +FAST(boost-use-to-string, 2.0) +FAST(bugprone-argument-comment, 4.0) FAST(bugprone-assert-side-effect, 1.0) -FAST(bugprone-assignment-in-if-condition, -0.0) -FAST(bugprone-bad-signal-to-kill-thread, -1.0) +FAST(bugprone-assignment-in-if-condition, 2.0) +FAST(bugprone-bad-signal-to-kill-thread, 1.0) FAST(bugprone-bool-pointer-implicit-conversion, 0.0) 
-FAST(bugprone-branch-clone, -0.0) +FAST(bugprone-branch-clone, 1.0) +FAST(bugprone-casting-through-void, 1.0) +FAST(bugprone-chained-comparison, 1.0) +FAST(bugprone-compare-pointer-to-member-virtual-function, -0.0) FAST(bugprone-copy-constructor-init, 1.0) -FAST(bugprone-dangling-handle, 0.0) -FAST(bugprone-dynamic-static-initializers, 1.0) +FAST(bugprone-crtp-constructor-accessibility, 0.0) +FAST(bugprone-dangling-handle, -0.0) +FAST(bugprone-dynamic-static-initializers, 0.0) FAST(bugprone-easily-swappable-parameters, 2.0) -FAST(bugprone-exception-escape, 1.0) -FAST(bugprone-fold-init-type, 2.0) +FAST(bugprone-empty-catch, 1.0) +FAST(bugprone-exception-escape, 0.0) +FAST(bugprone-fold-init-type, 1.0) FAST(bugprone-forward-declaration-namespace, 0.0) -FAST(bugprone-forwarding-reference-overload, -0.0) -FAST(bugprone-implicit-widening-of-multiplication-result, 3.0) +FAST(bugprone-forwarding-reference-overload, -1.0) +FAST(bugprone-implicit-widening-of-multiplicat
[llvm-branch-commits] [clang-tools-extra] release/19.x: [clangd] Update TidyFastChecks for release/19.x (#106354) (PR #106989)
llvmbot wrote: @llvm/pr-subscribers-clang-tools-extra Author: None (llvmbot) Changes Backport b47d7ce8121b1cb1923e879d58eaa1d63aeaaae2 Requested by: @kadircet --- Patch is 30.06 KiB, full version: https://github.com/llvm/llvm-project/pull/106989.diff 1 Files Affected: - (modified) clang-tools-extra/clangd/TidyFastChecks.inc (+367-302)
[llvm-branch-commits] [compiler-rt] release/19.x: [builtins] Fix divtc3.c etc. compilation on Solaris/SPARC with gcc (#101662) (PR #101847)
rorth wrote: It's difficult: on one hand it fixes a Solaris/SPARC build failure. On the other, it's said to cause problems for an out-of-tree z/OS port. Unfortunately, the developers refuse to publish their code, so it's almost impossible to reason about that code. https://github.com/llvm/llvm-project/pull/101847
[llvm-branch-commits] [llvm] release/19.x: [AVR] Fix 16-bit LDDs with immediate overflows (#104923) (PR #106993)
https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/106993
[llvm-branch-commits] [llvm] release/19.x: [AVR] Fix 16-bit LDDs with immediate overflows (#104923) (PR #106993)
https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/106993

Backport c7a4efa

Requested by: @EugeneZelenko

>From 2e14b75913fb4e90c7b833b351aee99dc9b0bdbd Mon Sep 17 00:00:00 2001
From: Patryk Wychowaniec
Date: Thu, 29 Aug 2024 09:28:17 +0200
Subject: [PATCH] [AVR] Fix 16-bit LDDs with immediate overflows (#104923)

16-bit loads are expanded into a pair of 8-bit loads, so the maximum offset of such 16-bit loads must be 62, not 63.

(cherry picked from commit c7a4efa4294789b1116f0c4a320c16fcb27cb62c)
---
 llvm/lib/Target/AVR/AVRISelDAGToDAG.cpp | 9 +-
 .../CodeGen/AVR/ldd-immediate-overflow.ll | 144 ++
 .../CodeGen/AVR/std-immediate-overflow.ll | 137 +
 3 files changed, 288 insertions(+), 2 deletions(-)
 create mode 100644 llvm/test/CodeGen/AVR/ldd-immediate-overflow.ll
 create mode 100644 llvm/test/CodeGen/AVR/std-immediate-overflow.ll

diff --git a/llvm/lib/Target/AVR/AVRISelDAGToDAG.cpp b/llvm/lib/Target/AVR/AVRISelDAGToDAG.cpp
index 77db876d47e446..a8927d834630ea 100644
--- a/llvm/lib/Target/AVR/AVRISelDAGToDAG.cpp
+++ b/llvm/lib/Target/AVR/AVRISelDAGToDAG.cpp
@@ -122,8 +122,13 @@ bool AVRDAGToDAGISel::SelectAddr(SDNode *Op, SDValue N, SDValue &Base,
     // offset allowed.
     MVT VT = cast<MemSDNode>(Op)->getMemoryVT().getSimpleVT();

-    // We only accept offsets that fit in 6 bits (unsigned).
-    if (isUInt<6>(RHSC) && (VT == MVT::i8 || VT == MVT::i16)) {
+    // We only accept offsets that fit in 6 bits (unsigned), with the exception
+    // of 16-bit loads - those can only go up to 62, because we desugar them
+    // into a pair of 8-bit loads like `ldd rx, RHSC` + `ldd ry, RHSC + 1`.
+    bool OkI8 = VT == MVT::i8 && RHSC <= 63;
+    bool OkI16 = VT == MVT::i16 && RHSC <= 62;
+
+    if (OkI8 || OkI16) {
       Base = N.getOperand(0);
       Disp = CurDAG->getTargetConstant(RHSC, dl, MVT::i8);

diff --git a/llvm/test/CodeGen/AVR/ldd-immediate-overflow.ll b/llvm/test/CodeGen/AVR/ldd-immediate-overflow.ll
new file mode 100644
index 00..6f1a4b32bb054c
--- /dev/null
+++ b/llvm/test/CodeGen/AVR/ldd-immediate-overflow.ll
@@ -0,0 +1,144 @@
+; RUN: llc -march=avr -filetype=asm -O1 < %s | FileCheck %s
+
+define void @check60(ptr %1) {
+; CHECK-LABEL: check60:
+; CHECK-NEXT: %bb.0
+; CHECK-NEXT: mov r30, r24
+; CHECK-NEXT: mov r31, r25
+; CHECK-NEXT: ldd r24, Z+60
+; CHECK-NEXT: ldd r25, Z+61
+; CHECK-NEXT: ldd r18, Z+62
+; CHECK-NEXT: ldd r19, Z+63
+; CHECK-NEXT: sts 3, r19
+; CHECK-NEXT: sts 2, r18
+; CHECK-NEXT: sts 1, r25
+; CHECK-NEXT: sts 0, r24
+; CHECK-NEXT: ret
+
+bb0:
+  %2 = getelementptr i8, ptr %1, i16 60
+  %3 = load i32, ptr %2, align 1
+  store i32 %3, ptr null, align 1
+  ret void
+}
+
+define void @check61(ptr %1) {
+; CHECK-LABEL: check61:
+; CHECK-NEXT: %bb.0
+; CHECK-NEXT: mov r30, r24
+; CHECK-NEXT: mov r31, r25
+; CHECK-NEXT: ldd r18, Z+61
+; CHECK-NEXT: ldd r19, Z+62
+; CHECK-NEXT: adiw r24, 63
+; CHECK-NEXT: mov r30, r24
+; CHECK-NEXT: mov r31, r25
+; CHECK-NEXT: ld r24, Z
+; CHECK-NEXT: ldd r25, Z+1
+; CHECK-NEXT: sts 3, r25
+; CHECK-NEXT: sts 2, r24
+; CHECK-NEXT: sts 1, r19
+; CHECK-NEXT: sts 0, r18
+; CHECK-NEXT: ret
+
+bb0:
+  %2 = getelementptr i8, ptr %1, i16 61
+  %3 = load i32, ptr %2, align 1
+  store i32 %3, ptr null, align 1
+  ret void
+}
+
+define void @check62(ptr %1) {
+; CHECK-LABEL: check62:
+; CHECK-NEXT: %bb.0
+; CHECK-NEXT: mov r30, r24
+; CHECK-NEXT: mov r31, r25
+; CHECK-NEXT: ldd r18, Z+62
+; CHECK-NEXT: ldd r19, Z+63
+; CHECK-NEXT: adiw r24, 62
+; CHECK-NEXT: mov r30, r24
+; CHECK-NEXT: mov r31, r25
+; CHECK-NEXT: ldd r24, Z+2
+; CHECK-NEXT: ldd r25, Z+3
+; CHECK-NEXT: sts 3, r25
+; CHECK-NEXT: sts 2, r24
+; CHECK-NEXT: sts 1, r19
+; CHECK-NEXT: sts 0, r18
+; CHECK-NEXT: ret
+
+bb0:
+  %2 = getelementptr i8, ptr %1, i16 62
+  %3 = load i32, ptr %2, align 1
+  store i32 %3, ptr null, align 1
+  ret void
+}
+
+define void @check63(ptr %1) {
+; CHECK-LABEL: check63:
+; CHECK-NEXT: %bb.0
+; CHECK-NEXT: adiw r24, 63
+; CHECK-NEXT: mov r30, r24
+; CHECK-NEXT: mov r31, r25
+; CHECK-NEXT: ld r24, Z
+; CHECK-NEXT: ldd r25, Z+1
+; CHECK-NEXT: ldd r18, Z+2
+; CHECK-NEXT: ldd r19, Z+3
+; CHECK-NEXT: sts 3, r19
+; CHECK-NEXT: sts 2, r18
+; CHECK-NEXT: sts 1, r25
+; CHECK-NEXT: sts 0, r24
+; CHECK-NEXT: ret
+
+bb0:
+  %2 = getelementptr i8, ptr %1, i16 63
+  %3 = load i32, ptr %2, align 1
+  store i32 %3, ptr null, align 1
+  ret void
+}
+
+define void @check64(ptr %1) {
+; CHECK-LABEL: check64:
+; CHECK-NEXT: %bb.0
+; CHECK-NEXT: subi r24, 192
+; CHECK-NEXT: sbci r25, 255
+; CHECK-NEXT: mov r30, r24
+; CHECK-NEXT: mov r31, r25
+; CHECK-NEXT: ld r24, Z
+; CHECK-NEXT: ldd r25, Z+1
+; CHECK-NEXT: ldd r18, Z+2
+; CHECK-NEXT: ldd r19, Z+3
+; CHECK-NEXT: sts 3, r19
+; CHECK-NEXT: sts 2, r18
+; CHECK-NEXT: sts 1, r25
+; CHECK-NEXT: sts 0, r24
+; CHECK-NEXT: ret
+
+bb0:
+  %2 = getelementptr i8, ptr %1, i16 64
+  %3 = load i32, ptr %2, align 1
+  store i32 %3, ptr null, align 1
+  ret void
+}
+
[llvm-branch-commits] [clang-tools-extra] release/19.x: [clangd] Update TidyFastChecks for release/19.x (#106354) (PR #106989)
https://github.com/HighCommander4 approved this pull request. +1 from me https://github.com/llvm/llvm-project/pull/106989 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] release/19.x: [MIPS] Optimize sortRelocs for o32 (PR #106008)
alexrp wrote: @MaskRay @topperc @wzssyqa @yingopq sorry for the pings, but I assume today is the last chance to get this in, so I would love to hear your thoughts on whether you think that's a good idea. :slightly_smiling_face: https://github.com/llvm/llvm-project/pull/106008
[llvm-branch-commits] [clang] [Serialization] Code cleanups and polish 83233 (PR #83237)
ilya-biryukov wrote: Thanks for fixing the problem. @alexfh and another person who was running these investigations before are on vacation until next week. I will ask if someone else can do this for them, but I wouldn't be surprised if it's involved enough that we need to wait until next week. Sorry about the long waiting times, but I still wanted to share so that folks are aware of the timelines. https://github.com/llvm/llvm-project/pull/83237
[llvm-branch-commits] [llvm] release/19.x: [MIPS] Optimize sortRelocs for o32 (PR #106008)
wzssyqa wrote: I don't think this is a bugfix, so it does not need to be backported. https://github.com/llvm/llvm-project/pull/106008
[llvm-branch-commits] [llvm] [X86] Avoid generating nested CALLSEQ for TLS pointer function arguments (PR #106965)
@@ -0,0 +1,17 @@ +; RUN: llc -verify-machineinstrs < %s -relocation-model=pic + +target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" +target triple = "x86_64-unknown-linux-gnu" + +; Passing a pointer to thread-local storage to a function can be problematic +; since computing such addresses requires a function call that is introduced +; very late in instruction selection. We need to ensure that we don't introduce +; nested call sequence markers if this function call happens in a call sequence. + +@TLS = internal thread_local global i64 zeroinitializer, align 8 +declare void @bar(ptr) +define internal void @foo() { +call void @bar(ptr @TLS) +call void @bar(ptr @TLS) +ret void +} shiltian wrote: add an empty line at the end of file https://github.com/llvm/llvm-project/pull/106965
[llvm-branch-commits] [llvm] [X86] Avoid generating nested CALLSEQ for TLS pointer function arguments (PR #106965)
@@ -0,0 +1,17 @@ +; RUN: llc -verify-machineinstrs < %s -relocation-model=pic + +target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" +target triple = "x86_64-unknown-linux-gnu" phoebewang wrote: This can be put in the RUN line, like `; RUN: llc -mtriple=x86_64 -verify-machineinstrs < %s -relocation-model=pic | FileCheck %s`. Also add FileCheck lines to show the assembly. https://github.com/llvm/llvm-project/pull/106965
[llvm-branch-commits] [llvm] [X86] Avoid generating nested CALLSEQ for TLS pointer function arguments (PR #106965)
@@ -0,0 +1,17 @@ +; RUN: llc -verify-machineinstrs < %s -relocation-model=pic + +target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" phoebewang wrote: This is not needed. https://github.com/llvm/llvm-project/pull/106965
[llvm-branch-commits] [llvm] [X86] Avoid generating nested CALLSEQ for TLS pointer function arguments (PR #106965)
@@ -0,0 +1,17 @@ +; RUN: llc -verify-machineinstrs < %s -relocation-model=pic + +target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" +target triple = "x86_64-unknown-linux-gnu" + +; Passing a pointer to thread-local storage to a function can be problematic +; since computing such addresses requires a function call that is introduced +; very late in instruction selection. We need to ensure that we don't introduce +; nested call sequence markers if this function call happens in a call sequence. + +@TLS = internal thread_local global i64 zeroinitializer, align 8 +declare void @bar(ptr) +define internal void @foo() { +call void @bar(ptr @TLS) +call void @bar(ptr @TLS) +ret void phoebewang wrote: Use two-space indentation. https://github.com/llvm/llvm-project/pull/106965
[llvm-branch-commits] [compiler-rt] release/19.x: [builtins] Fix divtc3.c etc. compilation on Solaris/SPARC with gcc (#101662) (PR #101847)
tru wrote: That does sound like it should be acceptable to merge if it's only blocking an out-of-tree implementation, since we don't officially support that config in that case. There is also the question of whether we need to backport this: if the main objection to it going into main is that an external port breaks, that might not be relevant in this case. @s-barannikov @zibi2 @perry-ca https://github.com/llvm/llvm-project/pull/101847
[llvm-branch-commits] [llvm] release/19.x: [MIPS] Optimize sortRelocs for o32 (PR #106008)
tru wrote: Yeah, I tend to agree that this is a nice-to-have, but it doesn't really qualify as a bugfix or a regression. https://github.com/llvm/llvm-project/pull/106008
[llvm-branch-commits] [llvm] release/19.x: [AVR] Fix parsing & emitting relative jumps (#106722) (PR #106729)
tru wrote: You can just close this PR and open a new one with both commits by running the cherry-pick comment command again with both SHAs listed. https://github.com/llvm/llvm-project/pull/106729
[llvm-branch-commits] [clang] [llvm] release/19.x: workflows/release-binaries: Enable flang builds on Windows (#101344) (PR #106480)
tru wrote: @tstellar several of the builds fail even after a rebase. Some of them seem related (especially the macOS ones), so I won't merge this until you've had some time to look at it. https://github.com/llvm/llvm-project/pull/106480