[llvm-branch-commits] [GlobalISel] Don't remove from unfinalized GISelWorkList (PR #102158)

2024-08-06 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler created 
https://github.com/llvm/llvm-project/pull/102158

Remove a hack from GISelWorkList caused by the Combiner removing
instructions from an unfinalized GISelWorkList during the DCE phase.
This is in preparation for larger changes to the WorkListMaintainer.



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [GlobalISel] Combiner: Observer-based DCE and retrying of combines (PR #102163)

2024-08-06 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler created 
https://github.com/llvm/llvm-project/pull/102163

Continues the work for disabling fixed-point iteration in the Combiner
(#94291).

This introduces improved Observer-based heuristics in the
GISel Combiner to retry combining defs/uses of modified instructions and
for performing sparse dead code elimination.

I have experimented a lot with the heuristics and this seems to be the
minimal set of heuristics that allows disabling fixed-point iteration
for AArch64 CTMark O2 without regressions.
Enabling this globally would pass all regression tests for all official
targets (apart from small benign diffs), but I have made this fully
opt-in for now, because I can't quantify the impact for other targets.

For performance numbers see my follow-up patch for AArch64.



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)

2024-08-06 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler created 
https://github.com/llvm/llvm-project/pull/102167

Disable fixed-point iteration in all AArch64 Combiners after #102163.
See inline comments for justification of test changes.



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)

2024-08-06 Thread Tobias Stadler via llvm-branch-commits

tobias-stadler wrote:

CTMark O0:
```
Program   compile_instructions  
  size..text
  base-O0  patch-O0 
   diff   base-O0patch-O0   diff
7zip/7zip-benchmark   141676688007.00  
141409424579.00 -0.19% 1004756.00 1004756.00 0.00%
Bullet/bullet  61212623785.00   
61017628199.00 -0.32%  873356.00  873356.00 0.00%
SPASS/SPASS14400540825.00   
14348447793.00 -0.36%  645136.00  645136.00 0.00%
tramp3d-v4/tramp3d-v4  18450535356.00   
18380583241.00 -0.38%  894656.00  894656.00 0.00%
kimwitu++/kc   22538775759.00   
22426150589.00 -0.50%  813492.00  813492.00 0.00%
ClamAV/clamscan14536543546.00   
14378298132.00 -1.09%  702832.00  702832.00 0.00%
lencod/lencod  12679661465.00   
12525031047.00 -1.22%  820564.00  820564.00 0.00%
mafft/pairlocalalign6940116013.00
6851372697.00 -1.28%  423000.00  423000.00 0.00%
consumer-typeset/consumer-typeset  12193354451.00   
12003246725.00 -1.56%  908844.00  908844.00 0.00%
sqlite3/sqlite3 4596588654.00
4512822506.00 -1.82%  459904.00  459904.00 0.00%
   Geomean difference   
   -0.87%   0.00%

```

CTMark O2:
```
Program   compile_instructions  
  size..text
  base-O2  patch-O2 
   diff   base-O2patch-O2  diff
7zip/7zip-benchmark   207008981549.00  
205988150072.00 -0.49% 824088.00  824088.00 0.00%
Bullet/bullet  99712646588.00   
99179976524.00 -0.53% 515864.00  515864.00 0.00%
SPASS/SPASS43192888774.00   
42858161952.00 -0.77% 443056.00  443056.00 0.00%
kimwitu++/kc   41563609655.00   
41217223921.00 -0.83% 461284.00  461284.00 0.00%
tramp3d-v4/tramp3d-v4  65778761906.00   
6520869.00 -0.87% 570560.00  570560.00 0.00%
lencod/lencod  57111009799.00   
56497616800.00 -1.07% 541960.00  541960.00 0.00%
ClamAV/clamscan53935475581.00   
53350116553.00 -1.09% 456352.00  456352.00 0.00%
mafft/pairlocalalign   32799063008.00   
32439200934.00 -1.10% 320888.00  320888.00 0.00%
consumer-typeset/consumer-typeset  35357798970.00   
34865643450.00 -1.39% 419436.00  419436.00 0.00%
sqlite3/sqlite335219936375.00   
34678565893.00 -1.54% 436160.00  436160.00 0.00%
   Geomean difference   
   -0.97%  0.00%

```

https://github.com/llvm/llvm-project/pull/102167
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)

2024-08-06 Thread Tobias Stadler via llvm-branch-commits


@@ -257,10 +257,10 @@ define i32 @udiv_div_by_180(i32 %x)
 ;
 ; GISEL-LABEL: udiv_div_by_180:
 ; GISEL:   // %bb.0:
-; GISEL-NEXT:uxtb w8, w0
-; GISEL-NEXT:mov w9, #5826 // =0x16c2
-; GISEL-NEXT:movk w9, #364, lsl #16
-; GISEL-NEXT:umull x8, w8, w9
+; GISEL-NEXT:mov w8, #5826 // =0x16c2
+; GISEL-NEXT:and w9, w0, #0xff
+; GISEL-NEXT:movk w8, #364, lsl #16
+; GISEL-NEXT:umull x8, w9, w8

tobias-stadler wrote:

expanding the div generates a bunch of instructions and the order in which they 
are recombined results in different ouput

https://github.com/llvm/llvm-project/pull/102167
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)

2024-08-06 Thread Tobias Stadler via llvm-branch-commits


@@ -9,6 +9,8 @@ body: |
   bb.1:
 ; CHECK-LABEL: name: crash_fn
 ; CHECK: [[C:%[0-9]+]]:_(p0) = G_CONSTANT i64 0
+; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 0
+; CHECK-NEXT: [[C2:%[0-9]+]]:_(s1) = G_CONSTANT i1 false

tobias-stadler wrote:

These are dead in the input IR, which doesn't happen inside the CodeGen 
pipeline.

https://github.com/llvm/llvm-project/pull/102167
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)

2024-08-06 Thread Tobias Stadler via llvm-branch-commits


@@ -88,7 +95,8 @@ body: |
 %cmp2:_(s32) = G_ICMP intpred(sge), %cmp_lhs(s64), %add
 
 $w0 = COPY %cmp2(s32)
-RET_ReallyLR implicit $w0
+$w1 = COPY %cmp1(s32)

tobias-stadler wrote:

cmp1 was dead in the input IR, but according to the comment this seems 
unintended.

https://github.com/llvm/llvm-project/pull/102167
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)

2024-08-06 Thread Tobias Stadler via llvm-branch-commits


@@ -33,8 +33,7 @@ define noundef i1 @logger(i32 noundef %logLevel, ptr %ea, ptr 
%pll) {
 ; CHECK-GI-NEXT:b.hi .LBB1_2
 ; CHECK-GI-NEXT:  // %bb.1: // %land.rhs
 ; CHECK-GI-NEXT:ldr x8, [x1]
-; CHECK-GI-NEXT:ldrb w8, [x8]
-; CHECK-GI-NEXT:and w0, w8, #0x1
+; CHECK-GI-NEXT:ldrb w0, [x8]

tobias-stadler wrote:

The immediate DCEing and better ordering of combines prevents the 
PreLegalizerCombiner from generating a bunch of useless artifacts that it can't 
combine away again. These artifacts are converted into G_AND by the 
ArtifactCombiner, which can't be combined away by the redundant_and combine, 
because KnownBits can't look through the implicit anyext of the G_LOAD. We are 
probably missing some combines that convert the load into sext/zext versions.

https://github.com/llvm/llvm-project/pull/102167
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)

2024-08-06 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler edited 
https://github.com/llvm/llvm-project/pull/102167
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [GlobalISel] Combiner: Observer-based DCE and retrying of combines (PR #102163)

2024-08-06 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler edited 
https://github.com/llvm/llvm-project/pull/102163
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)

2024-08-06 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler edited 
https://github.com/llvm/llvm-project/pull/102167
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)

2024-08-06 Thread Tobias Stadler via llvm-branch-commits

tobias-stadler wrote:

> Is this a fundamental issue of the combiner or do we have to revisit the 
> situation when the combiner becomes slowly more powerful?

This is a fundamental issue of combiner-style algorithms. Fixed-point iteration 
just burns too much compile-time for no good reason. Both the InstCombiner and 
the DAGCombiner don't use fixed-point iteration anymore for the same 
compile-time reasons. The heuristics I have implemented should be mostly on-par 
with how thoroughly the other combiner implementations currently handle 
retrying combines. The main difference is that we can rely on the Observer to 
detect changes for retrying combines and the other combiner implementations 
need to use WorkList-aware functions for performing changes on the IR. If this 
approach is good enough for SelectionDAG, I don't see this approach becoming a 
problem for GlobalISel.

https://github.com/llvm/llvm-project/pull/102167
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [GlobalISel] Combiner: Observer-based DCE and retrying of combines (PR #102163)

2024-08-06 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler edited 
https://github.com/llvm/llvm-project/pull/102163
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [GlobalISel] Combiner: Observer-based DCE and retrying of combines (PR #102163)

2024-08-06 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler edited 
https://github.com/llvm/llvm-project/pull/102163
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [GlobalISel] Don't remove from unfinalized GISelWorkList (PR #102158)

2024-08-11 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler updated 
https://github.com/llvm/llvm-project/pull/102158

>From 1f5757a4c3989755623d66c43575c858dcb13f75 Mon Sep 17 00:00:00 2001
From: Tobias Stadler 
Date: Tue, 6 Aug 2024 17:13:59 +0200
Subject: [PATCH] =?UTF-8?q?[=F0=9D=98=80=F0=9D=97=BD=F0=9D=97=BF]=20change?=
 =?UTF-8?q?s=20to=20main=20this=20commit=20is=20based=20on?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Created using spr 1.3.6-bogner-wip

[skip ci]
---
 llvm/lib/CodeGen/GlobalISel/Combiner.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/llvm/lib/CodeGen/GlobalISel/Combiner.cpp 
b/llvm/lib/CodeGen/GlobalISel/Combiner.cpp
index 3310ce5455c978..5da9e86b207618 100644
--- a/llvm/lib/CodeGen/GlobalISel/Combiner.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/Combiner.cpp
@@ -153,7 +153,7 @@ bool Combiner::combineMachineInstrs() {
 // down RPOT.
 Changed = false;
 
-RAIIDelegateInstaller DelInstall(MF, ObserverWrapper.get());
+RAIIMFObsDelInstaller DelInstall(MF, *ObserverWrapper);
 for (MachineBasicBlock *MBB : post_order(&MF)) {
   for (MachineInstr &CurMI :
llvm::make_early_inc_range(llvm::reverse(*MBB))) {

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [GlobalISel] Combiner: Observer-based DCE and retrying of combines (PR #102163)

2024-08-12 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler updated 
https://github.com/llvm/llvm-project/pull/102163


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [GlobalISel] Combiner: Observer-based DCE and retrying of combines (PR #102163)

2024-08-12 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler updated 
https://github.com/llvm/llvm-project/pull/102163


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [GlobalISel] Combiner: Observer-based DCE and retrying of combines (PR #102163)

2024-08-14 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler updated 
https://github.com/llvm/llvm-project/pull/102163


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [GlobalISel] Combiner: Observer-based DCE and retrying of combines (PR #102163)

2024-08-14 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler updated 
https://github.com/llvm/llvm-project/pull/102163


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [GlobalISel] Combiner: Observer-based DCE and retrying of combines (PR #102163)

2024-08-15 Thread Tobias Stadler via llvm-branch-commits


@@ -45,61 +45,190 @@ cl::OptionCategory GICombinerOptionCategory(
 );
 } // end namespace llvm
 
-/// This class acts as the glue the joins the CombinerHelper to the overall
+/// This class acts as the glue that joins the CombinerHelper to the overall
 /// Combine algorithm. The CombinerHelper is intended to report the
 /// modifications it makes to the MIR to the GISelChangeObserver and the
-/// observer subclass will act on these events. In this case, instruction
-/// erasure will cancel any future visits to the erased instruction and
-/// instruction creation will schedule that instruction for a future visit.
-/// Other Combiner implementations may require more complex behaviour from
-/// their GISelChangeObserver subclass.
+/// observer subclass will act on these events.
 class Combiner::WorkListMaintainer : public GISelChangeObserver {
-  using WorkListTy = GISelWorkList<512>;
-  WorkListTy &WorkList;
+protected:
+#ifndef NDEBUG

tobias-stadler wrote:

Good point, but I don't think it's needed in this case, because you can't get 
the complete type of WorkListMaintainer outside of Combiner.cpp. Combiner.h 
only contains the declaration. Thoughts?

https://github.com/llvm/llvm-project/pull/102163
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [GlobalISel] Combiner: Observer-based DCE and retrying of combines (PR #102163)

2024-08-15 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler edited 
https://github.com/llvm/llvm-project/pull/102163
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)

2024-08-16 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler updated 
https://github.com/llvm/llvm-project/pull/102167


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [AArch64][GlobalISel] Disable fixed-point iteration in all Combiners (PR #102167)

2024-08-16 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler updated 
https://github.com/llvm/llvm-project/pull/102167


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [GlobalISel] Fix miscompile when narrowing vector load/stores to non-byte-sized types (PR #136739)

2025-04-22 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler created 
https://github.com/llvm/llvm-project/pull/136739

LegalizerHelper::reduceLoadStoreWidth does not work for non-byte-sized types, 
because this would require (un)packing of bits across byte boundaries.

Precommit tests: #134904

>From e88a6e177837b478b4dc20def1b59f193b950965 Mon Sep 17 00:00:00 2001
From: Tobias Stadler 
Date: Wed, 9 Apr 2025 13:32:02 +0100
Subject: [PATCH] [GlobalISel] Fix miscompile when narrowing vector load/stores
 to non-byte-sized types

LegalizerHelper::reduceLoadStoreWidth does not work for non-byte-sized
types, because this would require (un)packing of bits across byte
boundaries.
---
 .../CodeGen/GlobalISel/LegalizerHelper.cpp|   5 +
 .../GlobalISel/legalize-load-store-vector.mir | 104 +---
 .../test/CodeGen/AArch64/fptosi-sat-vector.ll | 490 +++---
 .../test/CodeGen/AArch64/fptoui-sat-vector.ll | 424 ++-
 4 files changed, 368 insertions(+), 655 deletions(-)

diff --git a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp 
b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
index 0aa853389bf1a..4052060271331 100644
--- a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
@@ -5210,6 +5210,11 @@ LegalizerHelper::reduceLoadStoreWidth(GLoadStore 
&LdStMI, unsigned TypeIdx,
   if (TypeIdx != 0)
 return UnableToLegalize;
 
+  if (!NarrowTy.isByteSized()) {
+LLVM_DEBUG(dbgs() << "Can't narrow load/store to non-byte-sized type\n");
+return UnableToLegalize;
+  }
+
   // This implementation doesn't work for atomics. Give up instead of doing
   // something invalid.
   if (LdStMI.isAtomic())
diff --git 
a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir 
b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir
index 221980ff2c42e..3a2c57ab50147 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir
@@ -2,9 +2,11 @@
 # RUN: llc -O0 -mtriple=aarch64 -verify-machineinstrs -run-pass=legalizer 
-global-isel-abort=0 -pass-remarks-missed='gisel.*' -o - %s 2> %t.err | 
FileCheck %s
 # RUN: FileCheck -check-prefix=ERR %s < %t.err
 
-# ERR: remark: :0:0: unable to legalize instruction: 
%{{[0-9]+}}:_(s128) = G_LOAD %{{[0-9]+}}:_(p0) :: (load (<2 x s63>)) (in 
function: load-narrow-scalar-high-bits)
+# ERR: remark: :0:0: unable to legalize instruction: G_STORE 
%{{[0-9]+}}:_(<8 x s9>), %{{[0-9]+}}:_(p0) :: (store (<8 x s9>), align 16) (in 
function: store-narrow-non-byte-sized)
+# ERR-NEXT: remark: :0:0: unable to legalize instruction: 
%{{[0-9]+}}:_(<8 x s9>) = G_LOAD %{{[0-9]+}}:_(p0) :: (load (<8 x s9>), align 
16) (in function: load-narrow-non-byte-sized)
+# ERR-NEXT: remark: :0:0: unable to legalize instruction: 
%{{[0-9]+}}:_(s128) = G_LOAD %{{[0-9]+}}:_(p0) :: (load (<2 x s63>)) (in 
function: load-narrow-scalar-high-bits)
 
-# FIXME: Scalarized stores for non-byte-sized vector elements store incorrect 
partial values.
+# FIXME: Non-byte-sized vector elements cause fallback in 
LegalizerHelper::reduceLoadStoreWidth
 ---
 name:store-narrow-non-byte-sized
 tracksRegLiveness: true
@@ -15,60 +17,10 @@ body: |
 ; CHECK: liveins: $x8
 ; CHECK-NEXT: {{  $}}
 ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x8
-; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 256
-; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY [[C]](s32)
-; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 511
-; CHECK-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]]
-; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[AND]](s32)
-; CHECK-NEXT: G_STORE [[TRUNC]](s16), [[COPY]](p0) :: (store (s16), align 
16)
-; CHECK-NEXT: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
-; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C2]](s64)
-; CHECK-NEXT: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 257
-; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY [[C3]](s32)
-; CHECK-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY [[C1]](s32)
-; CHECK-NEXT: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[COPY3]]
-; CHECK-NEXT: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[AND1]](s32)
-; CHECK-NEXT: G_STORE [[TRUNC1]](s16), [[PTR_ADD]](p0) :: (store (s16) 
into unknown-address + 1, align 1)
-; CHECK-NEXT: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
-; CHECK-NEXT: [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C4]](s64)
-; CHECK-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C]](s32)
-; CHECK-NEXT: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C1]](s32)
-; CHECK-NEXT: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[COPY5]]
-; CHECK-NEXT: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[AND2]](s32)
-; CHECK-NEXT: G_STORE [[TRUNC2]](s16), [[PTR_ADD1]](p0) :: (store (s16) 
into unknown-address + 2)
-; CHECK-NEXT: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 3
-; CHECK-NEXT: [[PTR_ADD2:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C5]](s64)
- 

[llvm-branch-commits] [llvm] [GlobalISel] Fix miscompile when narrowing vector load/stores to non-byte-sized types (PR #136739)

2025-04-23 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler updated 
https://github.com/llvm/llvm-project/pull/136739

>From e88a6e177837b478b4dc20def1b59f193b950965 Mon Sep 17 00:00:00 2001
From: Tobias Stadler 
Date: Wed, 9 Apr 2025 13:32:02 +0100
Subject: [PATCH 1/2] [GlobalISel] Fix miscompile when narrowing vector
 load/stores to non-byte-sized types

LegalizerHelper::reduceLoadStoreWidth does not work for non-byte-sized
types, because this would require (un)packing of bits across byte
boundaries.
---
 .../CodeGen/GlobalISel/LegalizerHelper.cpp|   5 +
 .../GlobalISel/legalize-load-store-vector.mir | 104 +---
 .../test/CodeGen/AArch64/fptosi-sat-vector.ll | 490 +++---
 .../test/CodeGen/AArch64/fptoui-sat-vector.ll | 424 ++-
 4 files changed, 368 insertions(+), 655 deletions(-)

diff --git a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp 
b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
index 0aa853389bf1a..4052060271331 100644
--- a/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
+++ b/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
@@ -5210,6 +5210,11 @@ LegalizerHelper::reduceLoadStoreWidth(GLoadStore 
&LdStMI, unsigned TypeIdx,
   if (TypeIdx != 0)
 return UnableToLegalize;
 
+  if (!NarrowTy.isByteSized()) {
+LLVM_DEBUG(dbgs() << "Can't narrow load/store to non-byte-sized type\n");
+return UnableToLegalize;
+  }
+
   // This implementation doesn't work for atomics. Give up instead of doing
   // something invalid.
   if (LdStMI.isAtomic())
diff --git 
a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir 
b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir
index 221980ff2c42e..3a2c57ab50147 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store-vector.mir
@@ -2,9 +2,11 @@
 # RUN: llc -O0 -mtriple=aarch64 -verify-machineinstrs -run-pass=legalizer 
-global-isel-abort=0 -pass-remarks-missed='gisel.*' -o - %s 2> %t.err | 
FileCheck %s
 # RUN: FileCheck -check-prefix=ERR %s < %t.err
 
-# ERR: remark: :0:0: unable to legalize instruction: 
%{{[0-9]+}}:_(s128) = G_LOAD %{{[0-9]+}}:_(p0) :: (load (<2 x s63>)) (in 
function: load-narrow-scalar-high-bits)
+# ERR: remark: :0:0: unable to legalize instruction: G_STORE 
%{{[0-9]+}}:_(<8 x s9>), %{{[0-9]+}}:_(p0) :: (store (<8 x s9>), align 16) (in 
function: store-narrow-non-byte-sized)
+# ERR-NEXT: remark: :0:0: unable to legalize instruction: 
%{{[0-9]+}}:_(<8 x s9>) = G_LOAD %{{[0-9]+}}:_(p0) :: (load (<8 x s9>), align 
16) (in function: load-narrow-non-byte-sized)
+# ERR-NEXT: remark: :0:0: unable to legalize instruction: 
%{{[0-9]+}}:_(s128) = G_LOAD %{{[0-9]+}}:_(p0) :: (load (<2 x s63>)) (in 
function: load-narrow-scalar-high-bits)
 
-# FIXME: Scalarized stores for non-byte-sized vector elements store incorrect 
partial values.
+# FIXME: Non-byte-sized vector elements cause fallback in 
LegalizerHelper::reduceLoadStoreWidth
 ---
 name:store-narrow-non-byte-sized
 tracksRegLiveness: true
@@ -15,60 +17,10 @@ body: |
 ; CHECK: liveins: $x8
 ; CHECK-NEXT: {{  $}}
 ; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x8
-; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 256
-; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY [[C]](s32)
-; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 511
-; CHECK-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C1]]
-; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[AND]](s32)
-; CHECK-NEXT: G_STORE [[TRUNC]](s16), [[COPY]](p0) :: (store (s16), align 
16)
-; CHECK-NEXT: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
-; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C2]](s64)
-; CHECK-NEXT: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 257
-; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY [[C3]](s32)
-; CHECK-NEXT: [[COPY3:%[0-9]+]]:_(s32) = COPY [[C1]](s32)
-; CHECK-NEXT: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY2]], [[COPY3]]
-; CHECK-NEXT: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[AND1]](s32)
-; CHECK-NEXT: G_STORE [[TRUNC1]](s16), [[PTR_ADD]](p0) :: (store (s16) 
into unknown-address + 1, align 1)
-; CHECK-NEXT: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
-; CHECK-NEXT: [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C4]](s64)
-; CHECK-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY [[C]](s32)
-; CHECK-NEXT: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C1]](s32)
-; CHECK-NEXT: [[AND2:%[0-9]+]]:_(s32) = G_AND [[COPY4]], [[COPY5]]
-; CHECK-NEXT: [[TRUNC2:%[0-9]+]]:_(s16) = G_TRUNC [[AND2]](s32)
-; CHECK-NEXT: G_STORE [[TRUNC2]](s16), [[PTR_ADD1]](p0) :: (store (s16) 
into unknown-address + 2)
-; CHECK-NEXT: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 3
-; CHECK-NEXT: [[PTR_ADD2:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C5]](s64)
-; CHECK-NEXT: [[COPY6:%[0-9]+]]:_(s32) = COPY [[C3]](s32)
-; CHECK-NEXT: [[COPY7:%[0-9]+]]:_(s32) = COPY [[C1]](s32)
-; CHECK-NEXT: [[AND3:%[0-9]+]]:_(s32) = G_AN

[llvm-branch-commits] [llvm] [InlineCost] Simplify extractvalue across callsite (PR #145054)

2025-06-23 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler edited 
https://github.com/llvm/llvm-project/pull/145054
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [InlineCost] Simplify extractvalue across callsite (PR #145054)

2025-06-23 Thread Tobias Stadler via llvm-branch-commits




tobias-stadler wrote:

Oops, x86 has lower cost than AArch64. Thanks!

https://github.com/llvm/llvm-project/pull/145054
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [InlineCost] Simplify extractvalue across callsite (PR #145054)

2025-06-23 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler edited 
https://github.com/llvm/llvm-project/pull/145054
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [Remarks] Auto-detect remark parser format (PR #144554)

2025-06-17 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler created 
https://github.com/llvm/llvm-project/pull/144554

Add remark format 'Auto', which performs automatic detection of the remark 
format using the magic numbers at the beginning of the remarks files.

The RemarkLinker already did something similar, so we streamlined this and 
exposed this to llvm-remarkutil.

Depends on #144527

>From a428e237fcc52830549144bf3afdcddb29742b0d Mon Sep 17 00:00:00 2001
From: Tobias Stadler 
Date: Mon, 16 Jun 2025 15:32:15 +0100
Subject: [PATCH] [Remarks] Auto-detect remark parser format

Add remark format 'Auto', which performs automatic detection of the
remark format using the magic numbers at the beginning of the remarks
files.

The RemarkLinker already did something similar, so we streamlined this
and exposed this to llvm-remarkutil.

Depends on #144527
---
 llvm/include/llvm/Remarks/RemarkFormat.h  |  5 -
 llvm/include/llvm/Remarks/RemarkLinker.h  |  5 ++---
 llvm/lib/Remarks/RemarkFormat.cpp | 18 +++-
 llvm/lib/Remarks/RemarkLinker.cpp | 14 +++--
 llvm/lib/Remarks/RemarkParser.cpp | 21 +--
 llvm/lib/Remarks/RemarkSerializer.cpp |  6 --
 .../Inputs/broken-remark-magic.bitstream  |  1 +
 .../llvm-remarkutil/annotation-count.test |  2 ++
 .../broken-bitstream-remark-magic.test|  6 ++
 .../tools/llvm-remarkutil/empty-file.test |  5 +
 .../llvm-remarkutil/instruction-count.test|  4 +++-
 .../llvm-remarkutil/instruction-mix.test  |  4 +++-
 .../size-diff/no-difference.test  |  3 +++
 .../tools/llvm-remarkutil/RemarkUtilHelpers.h |  9 +---
 llvm/unittests/Remarks/RemarksLinkingTest.cpp |  4 +---
 15 files changed, 75 insertions(+), 32 deletions(-)
 create mode 100644 
llvm/test/tools/llvm-remarkutil/Inputs/broken-remark-magic.bitstream
 create mode 100644 
llvm/test/tools/llvm-remarkutil/broken-bitstream-remark-magic.test

diff --git a/llvm/include/llvm/Remarks/RemarkFormat.h 
b/llvm/include/llvm/Remarks/RemarkFormat.h
index a39a013dcf905..eda201d4ee6f1 100644
--- a/llvm/include/llvm/Remarks/RemarkFormat.h
+++ b/llvm/include/llvm/Remarks/RemarkFormat.h
@@ -23,7 +23,7 @@ namespace remarks {
 constexpr StringLiteral Magic("REMARKS");
 
 /// The format used for serializing/deserializing remarks.
-enum class Format { Unknown, YAML, Bitstream };
+enum class Format { Unknown, Auto, YAML, Bitstream };
 
 /// Parse and validate a string for the remark format.
 LLVM_ABI Expected parseFormat(StringRef FormatStr);
@@ -31,6 +31,9 @@ LLVM_ABI Expected parseFormat(StringRef FormatStr);
 /// Parse and validate a magic number to a remark format.
 LLVM_ABI Expected magicToFormat(StringRef Magic);
 
+/// Detect format based on selected format and magic number
+LLVM_ABI Expected detectFormat(Format Selected, StringRef Magic);
+
 } // end namespace remarks
 } // end namespace llvm
 
diff --git a/llvm/include/llvm/Remarks/RemarkLinker.h 
b/llvm/include/llvm/Remarks/RemarkLinker.h
index 5343c62144708..67208f40592a5 100644
--- a/llvm/include/llvm/Remarks/RemarkLinker.h
+++ b/llvm/include/llvm/Remarks/RemarkLinker.h
@@ -80,13 +80,12 @@ struct RemarkLinker {
   /// \p Buffer.
   /// \p Buffer can be either a standalone remark container or just
   /// metadata. This takes care of uniquing and merging the remarks.
-  LLVM_ABI Error link(StringRef Buffer,
-  std::optional RemarkFormat = std::nullopt);
+  LLVM_ABI Error link(StringRef Buffer, Format RemarkFormat = Format::Auto);
 
   /// Link the remarks found in \p Obj by looking for the right section and
   /// calling the method above.
   LLVM_ABI Error link(const object::ObjectFile &Obj,
-  std::optional RemarkFormat = std::nullopt);
+  Format RemarkFormat = Format::Auto);
 
   /// Serialize the linked remarks to the stream \p OS, using the format \p
   /// RemarkFormat.
diff --git a/llvm/lib/Remarks/RemarkFormat.cpp 
b/llvm/lib/Remarks/RemarkFormat.cpp
index 800f5bffe70da..1c52e352f9392 100644
--- a/llvm/lib/Remarks/RemarkFormat.cpp
+++ b/llvm/lib/Remarks/RemarkFormat.cpp
@@ -42,6 +42,22 @@ Expected llvm::remarks::magicToFormat(StringRef 
MagicStr) {
 
   if (Result == Format::Unknown)
 return createStringError(std::make_error_code(std::errc::invalid_argument),
- "Unknown remark magic: '%s'", MagicStr.data());
+ "Automatic detection of remark format failed. "
+ "Unknown magic number: '%.4s'",
+ MagicStr.data());
   return Result;
 }
+
+Expected llvm::remarks::detectFormat(Format Selected,
+ StringRef MagicStr) {
+  if (Selected == Format::Unknown)
+return createStringError(std::make_error_code(std::errc::invalid_argument),
+ "Unknown remark parser format.");
+  if (Selected != Format::Auto)
+return Selected;
+
+  //

[llvm-branch-commits] [llvm] [Remarks] Auto-detect remark parser format (PR #144554)

2025-06-17 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler updated 
https://github.com/llvm/llvm-project/pull/144554

>From a428e237fcc52830549144bf3afdcddb29742b0d Mon Sep 17 00:00:00 2001
From: Tobias Stadler 
Date: Mon, 16 Jun 2025 15:32:15 +0100
Subject: [PATCH 1/2] [Remarks] Auto-detect remark parser format

Add remark format 'Auto', which performs automatic detection of the
remark format using the magic numbers at the beginning of the remarks
files.

The RemarkLinker already did something similar, so we streamlined this
and exposed this to llvm-remarkutil.

Depends on #144527
---
 llvm/include/llvm/Remarks/RemarkFormat.h  |  5 -
 llvm/include/llvm/Remarks/RemarkLinker.h  |  5 ++---
 llvm/lib/Remarks/RemarkFormat.cpp | 18 +++-
 llvm/lib/Remarks/RemarkLinker.cpp | 14 +++--
 llvm/lib/Remarks/RemarkParser.cpp | 21 +--
 llvm/lib/Remarks/RemarkSerializer.cpp |  6 --
 .../Inputs/broken-remark-magic.bitstream  |  1 +
 .../llvm-remarkutil/annotation-count.test |  2 ++
 .../broken-bitstream-remark-magic.test|  6 ++
 .../tools/llvm-remarkutil/empty-file.test |  5 +
 .../llvm-remarkutil/instruction-count.test|  4 +++-
 .../llvm-remarkutil/instruction-mix.test  |  4 +++-
 .../size-diff/no-difference.test  |  3 +++
 .../tools/llvm-remarkutil/RemarkUtilHelpers.h |  9 +---
 llvm/unittests/Remarks/RemarksLinkingTest.cpp |  4 +---
 15 files changed, 75 insertions(+), 32 deletions(-)
 create mode 100644 
llvm/test/tools/llvm-remarkutil/Inputs/broken-remark-magic.bitstream
 create mode 100644 
llvm/test/tools/llvm-remarkutil/broken-bitstream-remark-magic.test

diff --git a/llvm/include/llvm/Remarks/RemarkFormat.h 
b/llvm/include/llvm/Remarks/RemarkFormat.h
index a39a013dcf905..eda201d4ee6f1 100644
--- a/llvm/include/llvm/Remarks/RemarkFormat.h
+++ b/llvm/include/llvm/Remarks/RemarkFormat.h
@@ -23,7 +23,7 @@ namespace remarks {
 constexpr StringLiteral Magic("REMARKS");
 
 /// The format used for serializing/deserializing remarks.
-enum class Format { Unknown, YAML, Bitstream };
+enum class Format { Unknown, Auto, YAML, Bitstream };
 
 /// Parse and validate a string for the remark format.
 LLVM_ABI Expected parseFormat(StringRef FormatStr);
@@ -31,6 +31,9 @@ LLVM_ABI Expected parseFormat(StringRef FormatStr);
 /// Parse and validate a magic number to a remark format.
 LLVM_ABI Expected magicToFormat(StringRef Magic);
 
+/// Detect format based on selected format and magic number
+LLVM_ABI Expected detectFormat(Format Selected, StringRef Magic);
+
 } // end namespace remarks
 } // end namespace llvm
 
diff --git a/llvm/include/llvm/Remarks/RemarkLinker.h 
b/llvm/include/llvm/Remarks/RemarkLinker.h
index 5343c62144708..67208f40592a5 100644
--- a/llvm/include/llvm/Remarks/RemarkLinker.h
+++ b/llvm/include/llvm/Remarks/RemarkLinker.h
@@ -80,13 +80,12 @@ struct RemarkLinker {
   /// \p Buffer.
   /// \p Buffer can be either a standalone remark container or just
   /// metadata. This takes care of uniquing and merging the remarks.
-  LLVM_ABI Error link(StringRef Buffer,
-  std::optional RemarkFormat = std::nullopt);
+  LLVM_ABI Error link(StringRef Buffer, Format RemarkFormat = Format::Auto);
 
   /// Link the remarks found in \p Obj by looking for the right section and
   /// calling the method above.
   LLVM_ABI Error link(const object::ObjectFile &Obj,
-  std::optional RemarkFormat = std::nullopt);
+  Format RemarkFormat = Format::Auto);
 
   /// Serialize the linked remarks to the stream \p OS, using the format \p
   /// RemarkFormat.
diff --git a/llvm/lib/Remarks/RemarkFormat.cpp 
b/llvm/lib/Remarks/RemarkFormat.cpp
index 800f5bffe70da..1c52e352f9392 100644
--- a/llvm/lib/Remarks/RemarkFormat.cpp
+++ b/llvm/lib/Remarks/RemarkFormat.cpp
@@ -42,6 +42,22 @@ Expected llvm::remarks::magicToFormat(StringRef 
MagicStr) {
 
   if (Result == Format::Unknown)
 return createStringError(std::make_error_code(std::errc::invalid_argument),
- "Unknown remark magic: '%s'", MagicStr.data());
+ "Automatic detection of remark format failed. "
+ "Unknown magic number: '%.4s'",
+ MagicStr.data());
   return Result;
 }
+
+Expected llvm::remarks::detectFormat(Format Selected,
+ StringRef MagicStr) {
+  if (Selected == Format::Unknown)
+return createStringError(std::make_error_code(std::errc::invalid_argument),
+ "Unknown remark parser format.");
+  if (Selected != Format::Auto)
+return Selected;
+
+  // Empty files are valid bitstream files
+  if (MagicStr.empty())
+return Format::Bitstream;
+  return magicToFormat(MagicStr);
+}
diff --git a/llvm/lib/Remarks/RemarkLinker.cpp 
b/llvm/lib/Remarks/RemarkLinker.cpp
index b8395aa135d82..0ca6217edfddd 100644
--- a/llvm/lib

[llvm-branch-commits] [llvm] [InlineCost] Simplify extractvalue across callsite (PR #145054)

2025-06-23 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler closed 
https://github.com/llvm/llvm-project/pull/145054
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [InlineCost] Simplify extractvalue across callsite (PR #145054)

2025-06-23 Thread Tobias Stadler via llvm-branch-commits

https://github.com/tobias-stadler reopened 
https://github.com/llvm/llvm-project/pull/145054
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits