[llvm-branch-commits] [GlobalISel] Don't remove from unfinalized GISelWorkList (PR #102158)
https://github.com/tobias-stadler created https://github.com/llvm/llvm-project/pull/102158 Remove a hack from GISelWorkList caused by the Combiner removing instructions from an unfinalized GISelWorkList during the DCE phase. This is in preparation for larger changes to the WorkListMaintainer. ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [GlobalISel] Combiner: Observer-based DCE and retrying of combines (PR #102163)
https://github.com/tobias-stadler created https://github.com/llvm/llvm-project/pull/102163 Continues the work for disabling fixed-point iteration in the Combiner (#94291). This introduces improved Observer-based heuristics in the GISel Combiner to retry combining defs/uses of modified instructions and for performing sparse dead code elimination. I have experimented a lot with the heuristics and this seems to be the minimal set of heuristics that allows disabling fixed-point iteration for AArch64 CTMark O2 without regressions. Enabling this globally would pass all regression tests for all official targets (apart from small benign diffs), but I have made this fully opt-in for now, because I can't quantify the impact for other targets. For performance numbers see my follow-up patch for AArch64. ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)
https://github.com/tobias-stadler created https://github.com/llvm/llvm-project/pull/102167 Disable fixed-point iteration in all AArch64 Combiners after #102163. See inline comments for justification of test changes. ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)
tobias-stadler wrote: CTMark O0: ``` Program compile_instructions size..text base-O0 patch-O0 diff base-O0patch-O0 diff 7zip/7zip-benchmark 141676688007.00 141409424579.00 -0.19% 1004756.00 1004756.00 0.00% Bullet/bullet 61212623785.00 61017628199.00 -0.32% 873356.00 873356.00 0.00% SPASS/SPASS14400540825.00 14348447793.00 -0.36% 645136.00 645136.00 0.00% tramp3d-v4/tramp3d-v4 18450535356.00 18380583241.00 -0.38% 894656.00 894656.00 0.00% kimwitu++/kc 22538775759.00 22426150589.00 -0.50% 813492.00 813492.00 0.00% ClamAV/clamscan14536543546.00 14378298132.00 -1.09% 702832.00 702832.00 0.00% lencod/lencod 12679661465.00 12525031047.00 -1.22% 820564.00 820564.00 0.00% mafft/pairlocalalign6940116013.00 6851372697.00 -1.28% 423000.00 423000.00 0.00% consumer-typeset/consumer-typeset 12193354451.00 12003246725.00 -1.56% 908844.00 908844.00 0.00% sqlite3/sqlite3 4596588654.00 4512822506.00 -1.82% 459904.00 459904.00 0.00% Geomean difference -0.87% 0.00% ``` CTMark O2: ``` Program compile_instructions size..text base-O2 patch-O2 diff base-O2patch-O2 diff 7zip/7zip-benchmark 207008981549.00 205988150072.00 -0.49% 824088.00 824088.00 0.00% Bullet/bullet 99712646588.00 99179976524.00 -0.53% 515864.00 515864.00 0.00% SPASS/SPASS43192888774.00 42858161952.00 -0.77% 443056.00 443056.00 0.00% kimwitu++/kc 41563609655.00 41217223921.00 -0.83% 461284.00 461284.00 0.00% tramp3d-v4/tramp3d-v4 65778761906.00 6520869.00 -0.87% 570560.00 570560.00 0.00% lencod/lencod 57111009799.00 56497616800.00 -1.07% 541960.00 541960.00 0.00% ClamAV/clamscan53935475581.00 53350116553.00 -1.09% 456352.00 456352.00 0.00% mafft/pairlocalalign 32799063008.00 32439200934.00 -1.10% 320888.00 320888.00 0.00% consumer-typeset/consumer-typeset 35357798970.00 34865643450.00 -1.39% 419436.00 419436.00 0.00% sqlite3/sqlite335219936375.00 34678565893.00 -1.54% 436160.00 436160.00 0.00% Geomean difference -0.97% 0.00% ``` https://github.com/llvm/llvm-project/pull/102167 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)
@@ -257,10 +257,10 @@ define i32 @udiv_div_by_180(i32 %x) ; ; GISEL-LABEL: udiv_div_by_180: ; GISEL: // %bb.0: -; GISEL-NEXT:uxtb w8, w0 -; GISEL-NEXT:mov w9, #5826 // =0x16c2 -; GISEL-NEXT:movk w9, #364, lsl #16 -; GISEL-NEXT:umull x8, w8, w9 +; GISEL-NEXT:mov w8, #5826 // =0x16c2 +; GISEL-NEXT:and w9, w0, #0xff +; GISEL-NEXT:movk w8, #364, lsl #16 +; GISEL-NEXT:umull x8, w9, w8 tobias-stadler wrote: expanding the div generates a bunch of instructions and the order in which they are recombined results in different ouput https://github.com/llvm/llvm-project/pull/102167 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)
@@ -9,6 +9,8 @@ body: | bb.1: ; CHECK-LABEL: name: crash_fn ; CHECK: [[C:%[0-9]+]]:_(p0) = G_CONSTANT i64 0 +; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 0 +; CHECK-NEXT: [[C2:%[0-9]+]]:_(s1) = G_CONSTANT i1 false tobias-stadler wrote: These are dead in the input IR, which doesn't happen inside the CodeGen pipeline. https://github.com/llvm/llvm-project/pull/102167 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)
@@ -88,7 +95,8 @@ body: | %cmp2:_(s32) = G_ICMP intpred(sge), %cmp_lhs(s64), %add $w0 = COPY %cmp2(s32) -RET_ReallyLR implicit $w0 +$w1 = COPY %cmp1(s32) tobias-stadler wrote: cmp1 was dead in the input IR, but according to the comment this seems unintended. https://github.com/llvm/llvm-project/pull/102167 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)
@@ -33,8 +33,7 @@ define noundef i1 @logger(i32 noundef %logLevel, ptr %ea, ptr %pll) { ; CHECK-GI-NEXT:b.hi .LBB1_2 ; CHECK-GI-NEXT: // %bb.1: // %land.rhs ; CHECK-GI-NEXT:ldr x8, [x1] -; CHECK-GI-NEXT:ldrb w8, [x8] -; CHECK-GI-NEXT:and w0, w8, #0x1 +; CHECK-GI-NEXT:ldrb w0, [x8] tobias-stadler wrote: The immediate DCEing and better ordering of combines prevents the PreLegalizerCombiner from generating a bunch of useless artifacts that it can't combine away again. These artifacts are converted into G_AND by the ArtifactCombiner, which can't be combined away by the redundant_and combine, because KnownBits can't look through the implicit anyext of the G_LOAD. We are probably missing some combines that convert the load into sext/zext versions. https://github.com/llvm/llvm-project/pull/102167 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)
https://github.com/tobias-stadler edited https://github.com/llvm/llvm-project/pull/102167 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [GlobalISel] Combiner: Observer-based DCE and retrying of combines (PR #102163)
https://github.com/tobias-stadler edited https://github.com/llvm/llvm-project/pull/102163 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)
https://github.com/tobias-stadler edited https://github.com/llvm/llvm-project/pull/102167 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)
tobias-stadler wrote: > Is this a fundamental issue of the combiner or do we have to revisit the > situation when the combiner becomes slowly more powerful? This is a fundamental issue of combiner-style algorithms. Fixed-point iteration just burns too much compile-time for no good reason. Both the InstCombiner and the DAGCombiner don't use fixed-point iteration anymore for the same compile-time reasons. The heuristics I have implemented should be mostly on-par with how thoroughly the other combiner implementations currently handle retrying combines. The main difference is that we can rely on the Observer to detect changes for retrying combines and the other combiner implementations need to use WorkList-aware functions for performing changes on the IR. If this approach is good enough for SelectionDAG, I don't see this approach becoming a problem for GlobalISel. https://github.com/llvm/llvm-project/pull/102167 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [GlobalISel] Combiner: Observer-based DCE and retrying of combines (PR #102163)
https://github.com/tobias-stadler edited https://github.com/llvm/llvm-project/pull/102163 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [GlobalISel] Combiner: Observer-based DCE and retrying of combines (PR #102163)
https://github.com/tobias-stadler edited https://github.com/llvm/llvm-project/pull/102163 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [GlobalISel] Don't remove from unfinalized GISelWorkList (PR #102158)
https://github.com/tobias-stadler updated https://github.com/llvm/llvm-project/pull/102158 >From 1f5757a4c3989755623d66c43575c858dcb13f75 Mon Sep 17 00:00:00 2001 From: Tobias Stadler Date: Tue, 6 Aug 2024 17:13:59 +0200 Subject: [PATCH] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20change?= =?UTF-8?q?s=20to=20main=20this=20commit=20is=20based=20on?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Created using spr 1.3.6-bogner-wip [skip ci] --- llvm/lib/CodeGen/GlobalISel/Combiner.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/llvm/lib/CodeGen/GlobalISel/Combiner.cpp b/llvm/lib/CodeGen/GlobalISel/Combiner.cpp index 3310ce5455c978..5da9e86b207618 100644 --- a/llvm/lib/CodeGen/GlobalISel/Combiner.cpp +++ b/llvm/lib/CodeGen/GlobalISel/Combiner.cpp @@ -153,7 +153,7 @@ bool Combiner::combineMachineInstrs() { // down RPOT. Changed = false; -RAIIDelegateInstaller DelInstall(MF, ObserverWrapper.get()); +RAIIMFObsDelInstaller DelInstall(MF, *ObserverWrapper); for (MachineBasicBlock *MBB : post_order(&MF)) { for (MachineInstr &CurMI : llvm::make_early_inc_range(llvm::reverse(*MBB))) { ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [GlobalISel] Combiner: Observer-based DCE and retrying of combines (PR #102163)
https://github.com/tobias-stadler updated https://github.com/llvm/llvm-project/pull/102163 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [GlobalISel] Combiner: Observer-based DCE and retrying of combines (PR #102163)
https://github.com/tobias-stadler updated https://github.com/llvm/llvm-project/pull/102163 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [GlobalISel] Combiner: Observer-based DCE and retrying of combines (PR #102163)
https://github.com/tobias-stadler updated https://github.com/llvm/llvm-project/pull/102163 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [GlobalISel] Combiner: Observer-based DCE and retrying of combines (PR #102163)
https://github.com/tobias-stadler updated https://github.com/llvm/llvm-project/pull/102163 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [GlobalISel] Combiner: Observer-based DCE and retrying of combines (PR #102163)
@@ -45,61 +45,190 @@ cl::OptionCategory GICombinerOptionCategory( ); } // end namespace llvm -/// This class acts as the glue the joins the CombinerHelper to the overall +/// This class acts as the glue that joins the CombinerHelper to the overall /// Combine algorithm. The CombinerHelper is intended to report the /// modifications it makes to the MIR to the GISelChangeObserver and the -/// observer subclass will act on these events. In this case, instruction -/// erasure will cancel any future visits to the erased instruction and -/// instruction creation will schedule that instruction for a future visit. -/// Other Combiner implementations may require more complex behaviour from -/// their GISelChangeObserver subclass. +/// observer subclass will act on these events. class Combiner::WorkListMaintainer : public GISelChangeObserver { - using WorkListTy = GISelWorkList<512>; - WorkListTy &WorkList; +protected: +#ifndef NDEBUG tobias-stadler wrote: Good point, but I don't think it's needed in this case, because you can't get the complete type of WorkListMaintainer outside of Combiner.cpp. Combiner.h only contains the declaration. Thoughts? https://github.com/llvm/llvm-project/pull/102163 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [GlobalISel] Combiner: Observer-based DCE and retrying of combines (PR #102163)
https://github.com/tobias-stadler edited https://github.com/llvm/llvm-project/pull/102163 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)
https://github.com/tobias-stadler updated https://github.com/llvm/llvm-project/pull/102167 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)
https://github.com/tobias-stadler updated https://github.com/llvm/llvm-project/pull/102167 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [GlobalISel] Fix miscompile when narrowing vector load/stores to non-byte-sized types (PR #136739)
https://github.com/tobias-stadler created https://github.com/llvm/llvm-project/pull/136739 LegalizerHelper::reduceLoadStoreWidth does not work for non-byte-sized types, because this would require (un)packing of bits across byte boundaries. Precommit tests: #134904 >From e88a6e177837b478b4dc20def1b59f193b950965 Mon Sep 17 00:00:00 2001 From: Tobias Stadler Date: Wed, 9 Apr 2025 13:32:02 +0100 Subject: [PATCH] [GlobalISel] Fix miscompile when narrowing vector load/stores to non-byte-sized types LegalizerHelper::reduceLoadStoreWidth does not work for non-byte-sized types, because this would require (un)packing of bits across byte boundaries. --- .../CodeGen/GlobalISel/LegalizerHelper.cpp| 5 + .../GlobalISel/legalize-load-store-vector.mir | 104 +--- .../test/CodeGen/AArch64/fptosi-sat-vector.ll | 490 +++--- .../test/CodeGen/AArch64/fptoui-sat-vector.ll | 424 ++- 4 files changed, 368 insertions(+), 655 deletions(-) diff --git a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp index 0aa853389bf1a..4052060271331 100644 --- a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp +++ b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp @@ -5210,6 +5210,11 @@ LegalizerHelper::reduceLoadStoreWidth(GLoadStore &LdStMI, unsigned TypeIdx, if (TypeIdx != 0) return UnableToLegalize; + if (!NarrowTy.isByteSized()) { +LLVM_DEBUG(dbgs() << "Can't narrow load/store to non-byte-sized type\n"); +return UnableToLegalize; + } + // This implementation doesn't work for atomics. Give up instead of doing // something invalid. if (LdStMI.isAtomic()) diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir index 221980ff2c42e..3a2c57ab50147 100644 --- a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir +++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir @@ -2,9 +2,11 @@ # RUN: llc -O0 -mtriple=aarch64 -verify-machineinstrs -run-pass=legalizer -global-isel-abort=0 -pass-remarks-missed='gisel.*' -o - %s 2> %t.err | FileCheck %s # RUN: FileCheck -check-prefix=ERR %s < %t.err -# ERR: remark: :0:0: unable to legalize instruction: %{{[0-9]+}}:_(s128) = G_LOAD %{{[0-9]+}}:_(p0) :: (load (<2 x s63>)) (in function: load-narrow-scalar-high-bits) +# ERR: remark: :0:0: unable to legalize instruction: G_STORE %{{[0-9]+}}:_(<8 x s9>), %{{[0-9]+}}:_(p0) :: (store (<8 x s9>), align 16) (in function: store-narrow-non-byte-sized) +# ERR-NEXT: remark: :0:0: unable to legalize instruction: %{{[0-9]+}}:_(<8 x s9>) = G_LOAD %{{[0-9]+}}:_(p0) :: (load (<8 x s9>), align 16) (in function: load-narrow-non-byte-sized) +# ERR-NEXT: remark: :0:0: unable to legalize instruction: %{{[0-9]+}}:_(s128) = G_LOAD %{{[0-9]+}}:_(p0) :: (load (<2 x s63>)) (in function: load-narrow-scalar-high-bits) -# FIXME: Scalarized stores for non-byte-sized vector elements store incorrect partial values. +# FIXME: Non-byte-sized vector elements cause fallback in LegalizerHelper::reduceLoadStoreWidth --- name:store-narrow-non-byte-sized tracksRegLiveness: true @@ -15,60 +17,10 @@ body: | ; CHECK: liveins: $x8 ; CHECK-NEXT: {{ $}} ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x8 -; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 256 -; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY [[C]](s32) -; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 511 -; CHECK-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] -; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[AND]](s32) -; CHECK-NEXT: G_STORE [[TRUNC]](s16), [[COPY]](p0) :: (store (s16), align 16) -; CHECK-NEXT: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 1 -; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C2]](s64) -; CHECK-NEXT: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 257 -; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY [[C3]](s32) -; CHECK-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY [[C1]](s32) -; CHECK-NEXT: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[COPY3]] -; CHECK-NEXT: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[AND1]](s32) -; CHECK-NEXT: G_STORE [[TRUNC1]](s16), [[PTR_ADD]](p0) :: (store (s16) into unknown-address + 1, align 1) -; CHECK-NEXT: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 -; CHECK-NEXT: [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C4]](s64) -; CHECK-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C]](s32) -; CHECK-NEXT: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C1]](s32) -; CHECK-NEXT: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[COPY5]] -; CHECK-NEXT: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[AND2]](s32) -; CHECK-NEXT: G_STORE [[TRUNC2]](s16), [[PTR_ADD1]](p0) :: (store (s16) into unknown-address + 2) -; CHECK-NEXT: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 3 -; CHECK-NEXT: [[PTR_ADD2:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C5]](s64) -
[llvm-branch-commits] [llvm] [GlobalISel] Fix miscompile when narrowing vector load/stores to non-byte-sized types (PR #136739)
https://github.com/tobias-stadler updated https://github.com/llvm/llvm-project/pull/136739 >From e88a6e177837b478b4dc20def1b59f193b950965 Mon Sep 17 00:00:00 2001 From: Tobias Stadler Date: Wed, 9 Apr 2025 13:32:02 +0100 Subject: [PATCH 1/2] [GlobalISel] Fix miscompile when narrowing vector load/stores to non-byte-sized types LegalizerHelper::reduceLoadStoreWidth does not work for non-byte-sized types, because this would require (un)packing of bits across byte boundaries. --- .../CodeGen/GlobalISel/LegalizerHelper.cpp| 5 + .../GlobalISel/legalize-load-store-vector.mir | 104 +--- .../test/CodeGen/AArch64/fptosi-sat-vector.ll | 490 +++--- .../test/CodeGen/AArch64/fptoui-sat-vector.ll | 424 ++- 4 files changed, 368 insertions(+), 655 deletions(-) diff --git a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp index 0aa853389bf1a..4052060271331 100644 --- a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp +++ b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp @@ -5210,6 +5210,11 @@ LegalizerHelper::reduceLoadStoreWidth(GLoadStore &LdStMI, unsigned TypeIdx, if (TypeIdx != 0) return UnableToLegalize; + if (!NarrowTy.isByteSized()) { +LLVM_DEBUG(dbgs() << "Can't narrow load/store to non-byte-sized type\n"); +return UnableToLegalize; + } + // This implementation doesn't work for atomics. Give up instead of doing // something invalid. if (LdStMI.isAtomic()) diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir index 221980ff2c42e..3a2c57ab50147 100644 --- a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir +++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir @@ -2,9 +2,11 @@ # RUN: llc -O0 -mtriple=aarch64 -verify-machineinstrs -run-pass=legalizer -global-isel-abort=0 -pass-remarks-missed='gisel.*' -o - %s 2> %t.err | FileCheck %s # RUN: FileCheck -check-prefix=ERR %s < %t.err -# ERR: remark: :0:0: unable to legalize instruction: %{{[0-9]+}}:_(s128) = G_LOAD %{{[0-9]+}}:_(p0) :: (load (<2 x s63>)) (in function: load-narrow-scalar-high-bits) +# ERR: remark: :0:0: unable to legalize instruction: G_STORE %{{[0-9]+}}:_(<8 x s9>), %{{[0-9]+}}:_(p0) :: (store (<8 x s9>), align 16) (in function: store-narrow-non-byte-sized) +# ERR-NEXT: remark: :0:0: unable to legalize instruction: %{{[0-9]+}}:_(<8 x s9>) = G_LOAD %{{[0-9]+}}:_(p0) :: (load (<8 x s9>), align 16) (in function: load-narrow-non-byte-sized) +# ERR-NEXT: remark: :0:0: unable to legalize instruction: %{{[0-9]+}}:_(s128) = G_LOAD %{{[0-9]+}}:_(p0) :: (load (<2 x s63>)) (in function: load-narrow-scalar-high-bits) -# FIXME: Scalarized stores for non-byte-sized vector elements store incorrect partial values. +# FIXME: Non-byte-sized vector elements cause fallback in LegalizerHelper::reduceLoadStoreWidth --- name:store-narrow-non-byte-sized tracksRegLiveness: true @@ -15,60 +17,10 @@ body: | ; CHECK: liveins: $x8 ; CHECK-NEXT: {{ $}} ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x8 -; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 256 -; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY [[C]](s32) -; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 511 -; CHECK-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]] -; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[AND]](s32) -; CHECK-NEXT: G_STORE [[TRUNC]](s16), [[COPY]](p0) :: (store (s16), align 16) -; CHECK-NEXT: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 1 -; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C2]](s64) -; CHECK-NEXT: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 257 -; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY [[C3]](s32) -; CHECK-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY [[C1]](s32) -; CHECK-NEXT: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[COPY3]] -; CHECK-NEXT: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[AND1]](s32) -; CHECK-NEXT: G_STORE [[TRUNC1]](s16), [[PTR_ADD]](p0) :: (store (s16) into unknown-address + 1, align 1) -; CHECK-NEXT: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 2 -; CHECK-NEXT: [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C4]](s64) -; CHECK-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C]](s32) -; CHECK-NEXT: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C1]](s32) -; CHECK-NEXT: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[COPY5]] -; CHECK-NEXT: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[AND2]](s32) -; CHECK-NEXT: G_STORE [[TRUNC2]](s16), [[PTR_ADD1]](p0) :: (store (s16) into unknown-address + 2) -; CHECK-NEXT: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 3 -; CHECK-NEXT: [[PTR_ADD2:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C5]](s64) -; CHECK-NEXT: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C3]](s32) -; CHECK-NEXT: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C1]](s32) -; CHECK-NEXT: [[AND3:%[0-9]+]]:_(s32) = G_AN
[llvm-branch-commits] [llvm] [InlineCost] Simplify extractvalue across callsite (PR #145054)
https://github.com/tobias-stadler edited https://github.com/llvm/llvm-project/pull/145054 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [InlineCost] Simplify extractvalue across callsite (PR #145054)
tobias-stadler wrote: Oops, x86 has lower cost than AArch64. Thanks! https://github.com/llvm/llvm-project/pull/145054 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [InlineCost] Simplify extractvalue across callsite (PR #145054)
https://github.com/tobias-stadler edited https://github.com/llvm/llvm-project/pull/145054 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [Remarks] Auto-detect remark parser format (PR #144554)
https://github.com/tobias-stadler created https://github.com/llvm/llvm-project/pull/144554 Add remark format 'Auto', which performs automatic detection of the remark format using the magic numbers at the beginning of the remarks files. The RemarkLinker already did something similar, so we streamlined this and exposed this to llvm-remarkutil. Depends on #144527 >From a428e237fcc52830549144bf3afdcddb29742b0d Mon Sep 17 00:00:00 2001 From: Tobias Stadler Date: Mon, 16 Jun 2025 15:32:15 +0100 Subject: [PATCH] [Remarks] Auto-detect remark parser format Add remark format 'Auto', which performs automatic detection of the remark format using the magic numbers at the beginning of the remarks files. The RemarkLinker already did something similar, so we streamlined this and exposed this to llvm-remarkutil. Depends on #144527 --- llvm/include/llvm/Remarks/RemarkFormat.h | 5 - llvm/include/llvm/Remarks/RemarkLinker.h | 5 ++--- llvm/lib/Remarks/RemarkFormat.cpp | 18 +++- llvm/lib/Remarks/RemarkLinker.cpp | 14 +++-- llvm/lib/Remarks/RemarkParser.cpp | 21 +-- llvm/lib/Remarks/RemarkSerializer.cpp | 6 -- .../Inputs/broken-remark-magic.bitstream | 1 + .../llvm-remarkutil/annotation-count.test | 2 ++ .../broken-bitstream-remark-magic.test| 6 ++ .../tools/llvm-remarkutil/empty-file.test | 5 + .../llvm-remarkutil/instruction-count.test| 4 +++- .../llvm-remarkutil/instruction-mix.test | 4 +++- .../size-diff/no-difference.test | 3 +++ .../tools/llvm-remarkutil/RemarkUtilHelpers.h | 9 +--- llvm/unittests/Remarks/RemarksLinkingTest.cpp | 4 +--- 15 files changed, 75 insertions(+), 32 deletions(-) create mode 100644 llvm/test/tools/llvm-remarkutil/Inputs/broken-remark-magic.bitstream create mode 100644 llvm/test/tools/llvm-remarkutil/broken-bitstream-remark-magic.test diff --git a/llvm/include/llvm/Remarks/RemarkFormat.h b/llvm/include/llvm/Remarks/RemarkFormat.h index a39a013dcf905..eda201d4ee6f1 100644 --- a/llvm/include/llvm/Remarks/RemarkFormat.h +++ b/llvm/include/llvm/Remarks/RemarkFormat.h @@ -23,7 +23,7 @@ namespace remarks { constexpr StringLiteral Magic("REMARKS"); /// The format used for serializing/deserializing remarks. -enum class Format { Unknown, YAML, Bitstream }; +enum class Format { Unknown, Auto, YAML, Bitstream }; /// Parse and validate a string for the remark format. LLVM_ABI Expected parseFormat(StringRef FormatStr); @@ -31,6 +31,9 @@ LLVM_ABI Expected parseFormat(StringRef FormatStr); /// Parse and validate a magic number to a remark format. LLVM_ABI Expected magicToFormat(StringRef Magic); +/// Detect format based on selected format and magic number +LLVM_ABI Expected detectFormat(Format Selected, StringRef Magic); + } // end namespace remarks } // end namespace llvm diff --git a/llvm/include/llvm/Remarks/RemarkLinker.h b/llvm/include/llvm/Remarks/RemarkLinker.h index 5343c62144708..67208f40592a5 100644 --- a/llvm/include/llvm/Remarks/RemarkLinker.h +++ b/llvm/include/llvm/Remarks/RemarkLinker.h @@ -80,13 +80,12 @@ struct RemarkLinker { /// \p Buffer. /// \p Buffer can be either a standalone remark container or just /// metadata. This takes care of uniquing and merging the remarks. - LLVM_ABI Error link(StringRef Buffer, - std::optional RemarkFormat = std::nullopt); + LLVM_ABI Error link(StringRef Buffer, Format RemarkFormat = Format::Auto); /// Link the remarks found in \p Obj by looking for the right section and /// calling the method above. LLVM_ABI Error link(const object::ObjectFile &Obj, - std::optional RemarkFormat = std::nullopt); + Format RemarkFormat = Format::Auto); /// Serialize the linked remarks to the stream \p OS, using the format \p /// RemarkFormat. diff --git a/llvm/lib/Remarks/RemarkFormat.cpp b/llvm/lib/Remarks/RemarkFormat.cpp index 800f5bffe70da..1c52e352f9392 100644 --- a/llvm/lib/Remarks/RemarkFormat.cpp +++ b/llvm/lib/Remarks/RemarkFormat.cpp @@ -42,6 +42,22 @@ Expected llvm::remarks::magicToFormat(StringRef MagicStr) { if (Result == Format::Unknown) return createStringError(std::make_error_code(std::errc::invalid_argument), - "Unknown remark magic: '%s'", MagicStr.data()); + "Automatic detection of remark format failed. " + "Unknown magic number: '%.4s'", + MagicStr.data()); return Result; } + +Expected llvm::remarks::detectFormat(Format Selected, + StringRef MagicStr) { + if (Selected == Format::Unknown) +return createStringError(std::make_error_code(std::errc::invalid_argument), + "Unknown remark parser format."); + if (Selected != Format::Auto) +return Selected; + + //
[llvm-branch-commits] [llvm] [Remarks] Auto-detect remark parser format (PR #144554)
https://github.com/tobias-stadler updated https://github.com/llvm/llvm-project/pull/144554 >From a428e237fcc52830549144bf3afdcddb29742b0d Mon Sep 17 00:00:00 2001 From: Tobias Stadler Date: Mon, 16 Jun 2025 15:32:15 +0100 Subject: [PATCH 1/2] [Remarks] Auto-detect remark parser format Add remark format 'Auto', which performs automatic detection of the remark format using the magic numbers at the beginning of the remarks files. The RemarkLinker already did something similar, so we streamlined this and exposed this to llvm-remarkutil. Depends on #144527 --- llvm/include/llvm/Remarks/RemarkFormat.h | 5 - llvm/include/llvm/Remarks/RemarkLinker.h | 5 ++--- llvm/lib/Remarks/RemarkFormat.cpp | 18 +++- llvm/lib/Remarks/RemarkLinker.cpp | 14 +++-- llvm/lib/Remarks/RemarkParser.cpp | 21 +-- llvm/lib/Remarks/RemarkSerializer.cpp | 6 -- .../Inputs/broken-remark-magic.bitstream | 1 + .../llvm-remarkutil/annotation-count.test | 2 ++ .../broken-bitstream-remark-magic.test| 6 ++ .../tools/llvm-remarkutil/empty-file.test | 5 + .../llvm-remarkutil/instruction-count.test| 4 +++- .../llvm-remarkutil/instruction-mix.test | 4 +++- .../size-diff/no-difference.test | 3 +++ .../tools/llvm-remarkutil/RemarkUtilHelpers.h | 9 +--- llvm/unittests/Remarks/RemarksLinkingTest.cpp | 4 +--- 15 files changed, 75 insertions(+), 32 deletions(-) create mode 100644 llvm/test/tools/llvm-remarkutil/Inputs/broken-remark-magic.bitstream create mode 100644 llvm/test/tools/llvm-remarkutil/broken-bitstream-remark-magic.test diff --git a/llvm/include/llvm/Remarks/RemarkFormat.h b/llvm/include/llvm/Remarks/RemarkFormat.h index a39a013dcf905..eda201d4ee6f1 100644 --- a/llvm/include/llvm/Remarks/RemarkFormat.h +++ b/llvm/include/llvm/Remarks/RemarkFormat.h @@ -23,7 +23,7 @@ namespace remarks { constexpr StringLiteral Magic("REMARKS"); /// The format used for serializing/deserializing remarks. -enum class Format { Unknown, YAML, Bitstream }; +enum class Format { Unknown, Auto, YAML, Bitstream }; /// Parse and validate a string for the remark format. LLVM_ABI Expected parseFormat(StringRef FormatStr); @@ -31,6 +31,9 @@ LLVM_ABI Expected parseFormat(StringRef FormatStr); /// Parse and validate a magic number to a remark format. LLVM_ABI Expected magicToFormat(StringRef Magic); +/// Detect format based on selected format and magic number +LLVM_ABI Expected detectFormat(Format Selected, StringRef Magic); + } // end namespace remarks } // end namespace llvm diff --git a/llvm/include/llvm/Remarks/RemarkLinker.h b/llvm/include/llvm/Remarks/RemarkLinker.h index 5343c62144708..67208f40592a5 100644 --- a/llvm/include/llvm/Remarks/RemarkLinker.h +++ b/llvm/include/llvm/Remarks/RemarkLinker.h @@ -80,13 +80,12 @@ struct RemarkLinker { /// \p Buffer. /// \p Buffer can be either a standalone remark container or just /// metadata. This takes care of uniquing and merging the remarks. - LLVM_ABI Error link(StringRef Buffer, - std::optional RemarkFormat = std::nullopt); + LLVM_ABI Error link(StringRef Buffer, Format RemarkFormat = Format::Auto); /// Link the remarks found in \p Obj by looking for the right section and /// calling the method above. LLVM_ABI Error link(const object::ObjectFile &Obj, - std::optional RemarkFormat = std::nullopt); + Format RemarkFormat = Format::Auto); /// Serialize the linked remarks to the stream \p OS, using the format \p /// RemarkFormat. diff --git a/llvm/lib/Remarks/RemarkFormat.cpp b/llvm/lib/Remarks/RemarkFormat.cpp index 800f5bffe70da..1c52e352f9392 100644 --- a/llvm/lib/Remarks/RemarkFormat.cpp +++ b/llvm/lib/Remarks/RemarkFormat.cpp @@ -42,6 +42,22 @@ Expected llvm::remarks::magicToFormat(StringRef MagicStr) { if (Result == Format::Unknown) return createStringError(std::make_error_code(std::errc::invalid_argument), - "Unknown remark magic: '%s'", MagicStr.data()); + "Automatic detection of remark format failed. " + "Unknown magic number: '%.4s'", + MagicStr.data()); return Result; } + +Expected llvm::remarks::detectFormat(Format Selected, + StringRef MagicStr) { + if (Selected == Format::Unknown) +return createStringError(std::make_error_code(std::errc::invalid_argument), + "Unknown remark parser format."); + if (Selected != Format::Auto) +return Selected; + + // Empty files are valid bitstream files + if (MagicStr.empty()) +return Format::Bitstream; + return magicToFormat(MagicStr); +} diff --git a/llvm/lib/Remarks/RemarkLinker.cpp b/llvm/lib/Remarks/RemarkLinker.cpp index b8395aa135d82..0ca6217edfddd 100644 --- a/llvm/lib
[llvm-branch-commits] [llvm] [InlineCost] Simplify extractvalue across callsite (PR #145054)
https://github.com/tobias-stadler closed https://github.com/llvm/llvm-project/pull/145054 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [InlineCost] Simplify extractvalue across callsite (PR #145054)
https://github.com/tobias-stadler reopened https://github.com/llvm/llvm-project/pull/145054 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits