[clang] [llvm] [AArch64] Correctly mark Neoverse N2 as an Armv9.0a core (PR #75055)
https://github.com/rgwott approved this pull request. Looking good. Is there a document to reference in the commit message? https://github.com/llvm/llvm-project/pull/75055 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [AArch64] Assembly support for the Checked Pointer Arithmetic Extension (PR #73777)
https://github.com/rgwott approved this pull request. Approved. https://github.com/llvm/llvm-project/pull/73777
[llvm] [clang] [AArch64] Assembly support for the Armv9.5-A Memory System Extensions (PR #76237)
https://github.com/rgwott approved this pull request. Great patch, looks good to me. https://github.com/llvm/llvm-project/pull/76237
[clang] [llvm] [AArch64] Remove Automatic Enablement of FEAT_F32MM (PR #85203)
@@ -2487,10 +2480,10 @@ AArch64ExtensionDependenciesBaseCPUTestParams {}}, {"cortex-a520", {}, - {"v9.2a","bf16", "crc", "dotprod", "f32mm", "flagm", - "fp-armv8", "fullfp16", "fp16fml", "i8mm","lse", "mte", - "pauth","perfmon", "predres", "ras", "rcpc", "rdm", - "sb", "neon", "ssbs","sve", "sve2-bitperm", "sve2"}, + {"v9.2a","bf16","crc", "dotprod", "flagm", "fp-armv8", rgwott wrote: Would these entries not be better off one per line? The patch modifies every line anyway, so we might as well lay them out so that future changes touch only the affected entries. https://github.com/llvm/llvm-project/pull/85203
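To make the one-entry-per-line suggestion concrete, here is a minimal sketch of that layout. The function names `cortexA520FeaturesBeforePatch` and `withoutFeature` are hypothetical and are not part of the test file; the entries are copied from the removed (`-`) lines of the hunk above, since the added (`+`) line is cut off in the quote.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: the feature list from the removed ('-') lines of the
// hunk, laid out one entry per line as the review comment suggests.
std::vector<std::string> cortexA520FeaturesBeforePatch() {
  return {
      "v9.2a",
      "bf16",
      "crc",
      "dotprod",
      "f32mm",
      "flagm",
      "fp-armv8",
      "fullfp16",
      "fp16fml",
      "i8mm",
      "lse",
      "mte",
      "pauth",
      "perfmon",
      "predres",
      "ras",
      "rcpc",
      "rdm",
      "sb",
      "neon",
      "ssbs",
      "sve",
      "sve2-bitperm",
      "sve2",
  };
}

// Hypothetical helper: drops every occurrence of Name from the list. It stands
// in for the single-line source edit that the one-per-line layout allows.
std::vector<std::string> withoutFeature(std::vector<std::string> Features,
                                        const std::string &Name) {
  Features.erase(std::remove(Features.begin(), Features.end(), Name),
                 Features.end());
  return Features;
}
```

With this layout, removing an entry such as "f32mm" shows up in a diff as one deleted line instead of a re-wrap of the whole initializer.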
[clang] [Clang][AArch64] Add ACLE macros for FEAT_PAuth_LR (PR #80163)
https://github.com/rgwott approved this pull request. LGTM. https://github.com/llvm/llvm-project/pull/80163
[clang] [llvm] [AArch64][SelectionDAG] Add CodeGen support for scalar FEAT_CPA (PR #105669)
@@ -1462,6 +1462,10 @@ enum NodeType { // Outputs: [rv], output chain, glue PATCHPOINT, + // PTRADD represents pointer arithmetic semantics, for those targets which + // benefit from that information. rgwott wrote: Done. https://github.com/llvm/llvm-project/pull/105669
[clang] [llvm] [AArch64][SelectionDAG] Add CodeGen support for scalar FEAT_CPA (PR #105669)
@@ -401,7 +401,7 @@ def tblockaddress: SDNode<"ISD::TargetBlockAddress", SDTPtrLeaf, [], def add: SDNode<"ISD::ADD" , SDTIntBinOp , [SDNPCommutative, SDNPAssociative]>; -def ptradd : SDNode<"ISD::ADD" , SDTPtrAddOp, []>; +def ptradd : SDNode<"ISD::PTRADD", SDTPtrAddOp, []>; rgwott wrote: This should not be a problem because PTRADD falls back to ADD in any target that does not opt in to shouldPreservePtrArith(). https://github.com/llvm/llvm-project/pull/105669
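The opt-in fallback described in this comment can be pictured with a toy model. This is a sketch under stated assumptions, not the code in the patch: `ToyTarget`, `ToyOptInTarget`, and `opcodeForPointerAdd` are hypothetical names, and the hook here takes no arguments, whereas the hook in the patch takes a `const Function &`.

```cpp
// Toy model only: these types are illustrative stand-ins, not LLVM classes.
enum class Opcode { ADD, PTRADD };

struct ToyTarget {
  virtual ~ToyTarget() = default;
  // Same default as the hook added in the patch: pointer arithmetic is not
  // preserved unless a target overrides this.
  virtual bool shouldPreservePtrArith() const { return false; }
};

// Hypothetical target that opts in when some feature is enabled.
struct ToyOptInTarget : ToyTarget {
  bool FeatureEnabled;
  explicit ToyOptInTarget(bool Enabled) : FeatureEnabled(Enabled) {}
  bool shouldPreservePtrArith() const override { return FeatureEnabled; }
};

// Chooses the node kind for a pointer + offset computation. Targets that do
// not opt in get a plain ADD, so their existing patterns are unaffected.
Opcode opcodeForPointerAdd(const ToyTarget &Target) {
  return Target.shouldPreservePtrArith() ? Opcode::PTRADD : Opcode::ADD;
}
```

In this model, a default `ToyTarget` always yields `Opcode::ADD`, which is the behaviour the comment relies on for backends that never opt in.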
[clang] [llvm] [AArch64][SelectionDAG] Add CodeGen support for scalar FEAT_CPA (PR #105669)
@@ -0,0 +1,451 @@ +; RUN: llc -mtriple=aarch64 -verify-machineinstrs --mattr=+cpa -O0 -global-isel=0 -fast-isel=0 %s -o - 2>&1 | FileCheck %s --check-prefixes=CHECK-CPA-O0 rgwott wrote: I have put codegen behind -mcpa-codegen. Let me know if you prefer some other flag name. https://github.com/llvm/llvm-project/pull/105669
[clang] [llvm] [AArch64][SelectionDAG] Add CodeGen support for scalar FEAT_CPA (PR #105669)
https://github.com/rgwott updated https://github.com/llvm/llvm-project/pull/105669 >From b26f61578c033eb6be0484e7346970176caafce2 Mon Sep 17 00:00:00 2001 From: Rodolfo Wottrich Date: Thu, 22 Aug 2024 15:17:00 +0100 Subject: [PATCH 01/13] [AArch64] Add CodeGen support for FEAT_CPA CPA stands for Checked Pointer Arithmetic and is part of the 2023 MTE architecture extensions for A-profile. The new CPA instructions perform regular pointer arithmetic (such as base register + offset) but check for overflow in the most significant bits of the result, enhancing security by detecting address tampering. In this patch we intend to capture the semantics of pointer arithmetic when it is not folded into loads/stores, then generate the appropriate CPA instructions. In order to preserve pointer arithmetic semantics through the backend, we add the PTRADD SelectionDAG node type. The PTRADD node and respective visitPTRADD() function are adapted from the CHERI/Morello LLVM tree. More details about the CPA extension can be found at: - https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-developments-2023 - https://developer.arm.com/documentation/ddi0602/2023-09/ This PR follows #79569. 
--- llvm/include/llvm/CodeGen/ISDOpcodes.h| 4 + llvm/include/llvm/Target/TargetMachine.h | 5 + .../include/llvm/Target/TargetSelectionDAG.td | 4 +- llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | 101 +++- llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp | 21 +- .../lib/CodeGen/SelectionDAG/SelectionDAG.cpp | 10 +- .../SelectionDAG/SelectionDAGBuilder.cpp | 44 +- .../SelectionDAG/SelectionDAGDumper.cpp | 1 + llvm/lib/Target/AArch64/AArch64InstrInfo.td | 20 + .../Target/AArch64/AArch64TargetMachine.cpp | 4 + .../lib/Target/AArch64/AArch64TargetMachine.h | 4 + .../GISel/AArch64InstructionSelector.cpp | 4 + llvm/test/CodeGen/AArch64/cpa-globalisel.ll | 455 ++ llvm/test/CodeGen/AArch64/cpa-selectiondag.ll | 449 + 14 files changed, 1104 insertions(+), 22 deletions(-) create mode 100644 llvm/test/CodeGen/AArch64/cpa-globalisel.ll create mode 100644 llvm/test/CodeGen/AArch64/cpa-selectiondag.ll diff --git a/llvm/include/llvm/CodeGen/ISDOpcodes.h b/llvm/include/llvm/CodeGen/ISDOpcodes.h index 86ff2628975942b..305b3349302 100644 --- a/llvm/include/llvm/CodeGen/ISDOpcodes.h +++ b/llvm/include/llvm/CodeGen/ISDOpcodes.h @@ -1452,6 +1452,10 @@ enum NodeType { // Outputs: [rv], output chain, glue PATCHPOINT, + // PTRADD represents pointer arithmetic semantics, for those targets which + // benefit from that information. + PTRADD, + // Vector Predication #define BEGIN_REGISTER_VP_SDNODE(VPSDID, ...) VPSDID, #include "llvm/IR/VPIntrinsics.def" diff --git a/llvm/include/llvm/Target/TargetMachine.h b/llvm/include/llvm/Target/TargetMachine.h index c3e9d41315f617d..26425fced525288 100644 --- a/llvm/include/llvm/Target/TargetMachine.h +++ b/llvm/include/llvm/Target/TargetMachine.h @@ -434,6 +434,11 @@ class TargetMachine { function_ref MPart)> ModuleCallback) { return false; } + + /// True if target has some particular form of dealing with pointer arithmetic + /// semantics. 
False if pointer arithmetic should not be preserved for passes + /// such as instruction selection, and can fallback to regular arithmetic. + virtual bool shouldPreservePtrArith(const Function &F) const { return false; } }; /// This class describes a target machine that is implemented with the LLVM diff --git a/llvm/include/llvm/Target/TargetSelectionDAG.td b/llvm/include/llvm/Target/TargetSelectionDAG.td index 172deffbd31771e..aeb27ccf921a4b0 100644 --- a/llvm/include/llvm/Target/TargetSelectionDAG.td +++ b/llvm/include/llvm/Target/TargetSelectionDAG.td @@ -109,7 +109,7 @@ def SDTOther : SDTypeProfile<1, 0, [SDTCisVT<0, OtherVT>]>; // for 'vt'. def SDTUNDEF : SDTypeProfile<1, 0, []>; // for 'undef'. def SDTUnaryOp : SDTypeProfile<1, 1, []>; // for bitconvert. -def SDTPtrAddOp : SDTypeProfile<1, 2, [ // ptradd +def SDTPtrAddOp : SDTypeProfile<1, 2, [ // ptradd SDTCisSameAs<0, 1>, SDTCisInt<2>, SDTCisPtrTy<1> ]>; def SDTIntBinOp : SDTypeProfile<1, 2, [ // add, and, or, xor, udiv, etc. @@ -390,7 +390,7 @@ def tblockaddress: SDNode<"ISD::TargetBlockAddress", SDTPtrLeaf, [], def add: SDNode<"ISD::ADD" , SDTIntBinOp , [SDNPCommutative, SDNPAssociative]>; -def ptradd : SDNode<"ISD::ADD" , SDTPtrAddOp, []>; +def ptradd : SDNode<"ISD::PTRADD", SDTPtrAddOp, []>; def sub: SDNode<"ISD::SUB" , SDTIntBinOp>; def mul: SDNode<"ISD::MUL" , SDTIntBinOp, [SDNPCommutative, SDNPAssociative]>; diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp index 11935cbc309f017..16a1
[clang] [llvm] [AArch64][SelectionDAG] Add CodeGen support for scalar FEAT_CPA (PR #105669)
@@ -0,0 +1,451 @@ +; RUN: llc -mtriple=aarch64 -verify-machineinstrs --mattr=+cpa -O0 -global-isel=0 -fast-isel=0 %s -o - 2>&1 | FileCheck %s --check-prefixes=CHECK-CPA-O0 +; RUN: llc -mtriple=aarch64 -verify-machineinstrs --mattr=+cpa -O3 -global-isel=0 -fast-isel=0 %s -o - 2>&1 | FileCheck %s --check-prefixes=CHECK-CPA-O3 rgwott wrote: It is worth having both because the selected instruction sometimes differs between -O0 and -O3, which lets us exercise both. It did become a little less useful once a SelectionDAG fold opportunity was removed during the course of this PR, but I am in favour of keeping it. https://github.com/llvm/llvm-project/pull/105669
[clang] [llvm] [AArch64][SelectionDAG] Add CodeGen support for scalar FEAT_CPA (PR #105669)
@@ -0,0 +1,451 @@ +; RUN: llc -mtriple=aarch64 -verify-machineinstrs --mattr=+cpa -O0 -global-isel=0 -fast-isel=0 %s -o - 2>&1 | FileCheck %s --check-prefixes=CHECK-CPA-O0 +; RUN: llc -mtriple=aarch64 -verify-machineinstrs --mattr=+cpa -O3 -global-isel=0 -fast-isel=0 %s -o - 2>&1 | FileCheck %s --check-prefixes=CHECK-CPA-O3 +; RUN: llc -mtriple=aarch64 -verify-machineinstrs --mattr=-cpa -O0 -global-isel=0 -fast-isel=0 %s -o - 2>&1 | FileCheck %s --check-prefixes=CHECK-NOCPA-O0 +; RUN: llc -mtriple=aarch64 -verify-machineinstrs --mattr=-cpa -O3 -global-isel=0 -fast-isel=0 %s -o - 2>&1 | FileCheck %s --check-prefixes=CHECK-NOCPA-O3 + +%struct.my_type = type { i64, i64 } +%struct.my_type2 = type { i64, i64, i64, i64, i64, i64 } + +@array = external dso_local global [10 x %struct.my_type], align 8 +@array2 = external dso_local global [10 x %struct.my_type2], align 8 + +define void @addpt1(i64 %index, i64 %arg) { +; CHECK-CPA-O0-LABEL:addpt1: +; CHECK-CPA-O0: addpt [[REG1:x[0-9]+]], x{{[0-9]+}}, x{{[0-9]+}}, lsl #4 +; CHECK-CPA-O0: str x{{[0-9]+}}, [[[REG1]], #8] +; +; CHECK-CPA-O3-LABEL:addpt1: +; CHECK-CPA-O3: addpt [[REG1:x[0-9]+]], x{{[0-9]+}}, x{{[0-9]+}}, lsl #4 +; CHECK-CPA-O3: str x{{[0-9]+}}, [[[REG1]], #8] +; +; CHECK-NOCPA-O0-LABEL: addpt1: +; CHECK-NOCPA-O0:add [[REG1:x[0-9]+]], x{{[0-9]+}}, x{{[0-9]+}}, lsl #4 +; CHECK-NOCPA-O0:str x{{[0-9]+}}, [[[REG1]], #8] + +; CHECK-NOCPA-O3-LABEL: addpt1: +; CHECK-NOCPA-O3:add [[REG1:x[0-9]+]], x{{[0-9]+}}, x{{[0-9]+}}, lsl #4 +; CHECK-NOCPA-O3:str x{{[0-9]+}}, [[[REG1]], #8] +entry: + %e2 = getelementptr inbounds %struct.my_type, ptr @array, i64 %index, i32 1 + store i64 %arg, ptr %e2, align 8 + ret void +} + +define void @maddpt1(i32 %pos, ptr %val) { +; CHECK-CPA-O0-LABEL:maddpt1: +; CHECK-CPA-O0: maddpt[[REG1:x[0-9]+]], x{{[0-9]+}}, x{{[0-9]+}}, x{{[0-9]+}} +; CHECK-CPA-O0: str q{{[0-9]+}}, [[[REG1]], #32] +; CHECK-CPA-O0: str q{{[0-9]+}}, [[[REG1]], #16] +; CHECK-CPA-O0: str q{{[0-9]+}}, [[[REG1]]] +; +; 
CHECK-CPA-O3-LABEL:maddpt1: +; CHECK-CPA-O3: maddpt[[REG1:x[0-9]+]], x{{[0-9]+}}, x{{[0-9]+}}, x{{[0-9]+}} +; CHECK-CPA-O3: stp q{{[0-9]+}}, q{{[0-9]+}}, [[[REG1]], #16] +; CHECK-CPA-O3: str q{{[0-9]+}}, [[[REG1]]] +; +; CHECK-NOCPA-O0-LABEL: maddpt1: +; CHECK-NOCPA-O0:smaddl[[REG1:x[0-9]+]], w{{[0-9]+}}, w{{[0-9]+}}, x{{[0-9]+}} +; CHECK-NOCPA-O0:str q{{[0-9]+}}, [[[REG1]], #32] +; CHECK-NOCPA-O0:str q{{[0-9]+}}, [[[REG1]], #16] +; CHECK-NOCPA-O0:str q{{[0-9]+}}, [[[REG1]]] +; +; CHECK-NOCPA-O3-LABEL: maddpt1: +; CHECK-NOCPA-O3:smaddl[[REG1:x[0-9]+]], w{{[0-9]+}}, w{{[0-9]+}}, x{{[0-9]+}} +; CHECK-NOCPA-O3:stp q{{[0-9]+}}, q{{[0-9]+}}, [[[REG1]], #16] +; CHECK-NOCPA-O3:str q{{[0-9]+}}, [[[REG1]]] +entry: + %idxprom = sext i32 %pos to i64 + %arrayidx = getelementptr inbounds [10 x %struct.my_type2], ptr @array2, i64 0, i64 %idxprom + tail call void @llvm.memcpy.p0.p0.i64(ptr align 8 dereferenceable(48) %arrayidx, ptr align 8 dereferenceable(48) %val, i64 48, i1 false) + ret void +} + +define void @msubpt1(i32 %index, i32 %elem) { +; CHECK-CPA-O0-LABEL:msubpt1: +; CHECK-CPA-O0: addpt [[REG1:x[0-9]+]], x{{[0-9]+}}, [[REG1]] +; CHECK-CPA-O0: addpt [[REG2:x[0-9]+]], x{{[0-9]+}}, [[REG2]] +; CHECK-CPA-O0: ldr q{{[0-9]+}}, [[[REG2]], #16] +; CHECK-CPA-O0: ldr q{{[0-9]+}}, [[[REG2]], #32] +; CHECK-CPA-O0: str q{{[0-9]+}}, [[[REG1]], #32] +; CHECK-CPA-O0: str q{{[0-9]+}}, [[[REG1]], #16] +; CHECK-CPA-O0: str q{{[0-9]+}}, [[[REG1]]] +; +; CHECK-CPA-O3-LABEL:msubpt1: +; CHECK-CPA-O3: addpt [[REG1:x[0-9]+]], x{{[0-9]+}}, [[REG1]] +; CHECK-CPA-O3: ldp q{{[0-9]+}}, q{{[0-9]+}}, [[[REG1]], #16] +; CHECK-CPA-O3: addpt [[REG2:x[0-9]+]], x{{[0-9]+}}, [[REG2]] +; CHECK-CPA-O3: stp q{{[0-9]+}}, q{{[0-9]+}}, [[[REG2]], #16] +; CHECK-CPA-O3: str q{{[0-9]+}}, [[[REG2]]] +; +; CHECK-NOCPA-O0-LABEL: msubpt1: +; CHECK-NOCPA-O0:mneg [[REG1:x[0-9]+]], x{{[0-9]+}}, x{{[0-9]+}} +; CHECK-NOCPA-O0:add [[REG2:x[0-9]+]], x{{[0-9]+}}, [[REG1]] +; CHECK-NOCPA-O0:str q{{[0-9]+}}, [[[REG2]], #320] +; 
CHECK-NOCPA-O0:str q{{[0-9]+}}, [[[REG2]], #304] +; CHECK-NOCPA-O0:str q{{[0-9]+}}, [[[REG2]], #288] +; +; CHECK-NOCPA-O3-LABEL: msubpt1: +; CHECK-NOCPA-O3:mneg [[REG1:x[0-9]+]], x{{[0-9]+}}, x{{[0-9]+}} +; CHECK-NOCPA-O3:add [[REG2:x[0-9]+]], x{{[0-9]+}}, [[REG1]] +; CHECK-NOCPA-O3:stp q{{[0-9]+}}, q{{[0-9]+}}, [[[REG2]], #304] +; CHECK-NOCPA-O3:str q{{[0-9]+}}, [[[REG2]], #288] +entry: + %idx.ext = sext i32 %index to i64 + %idx.neg = sub nsw i64 0, %idx.ext + %add.
[clang] [llvm] [AArch64][SelectionDAG] Add CodeGen support for scalar FEAT_CPA (PR #105669)
@@ -0,0 +1,451 @@ +; RUN: llc -mtriple=aarch64 -verify-machineinstrs --mattr=+cpa -O0 -global-isel=0 -fast-isel=0 %s -o - 2>&1 | FileCheck %s --check-prefixes=CHECK-CPA-O0 +; RUN: llc -mtriple=aarch64 -verify-machineinstrs --mattr=+cpa -O3 -global-isel=0 -fast-isel=0 %s -o - 2>&1 | FileCheck %s --check-prefixes=CHECK-CPA-O3 +; RUN: llc -mtriple=aarch64 -verify-machineinstrs --mattr=-cpa -O0 -global-isel=0 -fast-isel=0 %s -o - 2>&1 | FileCheck %s --check-prefixes=CHECK-NOCPA-O0 +; RUN: llc -mtriple=aarch64 -verify-machineinstrs --mattr=-cpa -O3 -global-isel=0 -fast-isel=0 %s -o - 2>&1 | FileCheck %s --check-prefixes=CHECK-NOCPA-O3 + +%struct.my_type = type { i64, i64 } +%struct.my_type2 = type { i64, i64, i64, i64, i64, i64 } + +@array = external dso_local global [10 x %struct.my_type], align 8 +@array2 = external dso_local global [10 x %struct.my_type2], align 8 + +define void @addpt1(i64 %index, i64 %arg) { +; CHECK-CPA-O0-LABEL:addpt1: rgwott wrote: I am not sure why. Could you explain your rationale? https://github.com/llvm/llvm-project/pull/105669
[clang] [llvm] [AArch64][SelectionDAG] Add CodeGen support for scalar FEAT_CPA (PR #105669)
@@ -10382,6 +10382,26 @@ let Predicates = [HasCPA] in { // Scalar multiply-add/subtract def MADDPT : MulAccumCPA<0, "maddpt">; def MSUBPT : MulAccumCPA<1, "msubpt">; + + // Rules to use CPA instructions in pointer arithmetic patterns which are not + // folded into loads/stores. The AddedComplexity serves to help supersede + // other simpler (non-CPA) patterns and make sure CPA is used instead. + let AddedComplexity = 20 in { rgwott wrote: That is a leftover from the previous PR, in which SelectionDAG did not differentiate between pointer and integer arithmetic (which was wrong), and this made instruction selection "correct". I have removed it now, well spotted. https://github.com/llvm/llvm-project/pull/105669
[clang] [llvm] [AArch64][SelectionDAG] Add CodeGen support for scalar FEAT_CPA (PR #105669)
https://github.com/rgwott edited https://github.com/llvm/llvm-project/pull/105669
[clang] [llvm] [AArch64][SelectionDAG] Add CodeGen support for scalar FEAT_CPA (PR #105669)
@@ -401,7 +401,7 @@ def tblockaddress: SDNode<"ISD::TargetBlockAddress", SDTPtrLeaf, [], def add: SDNode<"ISD::ADD" , SDTIntBinOp , [SDNPCommutative, SDNPAssociative]>; -def ptradd : SDNode<"ISD::ADD" , SDTPtrAddOp, []>; +def ptradd : SDNode<"ISD::PTRADD", SDTPtrAddOp, []>; rgwott wrote: I looked into this AMDGPU occurrence. When the compiler is built, TableGen does emit a ptradd pattern that selects a certain instruction. However: (1) there are two consecutive entries, one for add and one for ptradd, that do the same thing, so they are equivalent, and (2) at compile time the PTRADD node never exists for that backend, so the ptradd entry is never selected (the one for ADD is selected instead). So this is not a problem. https://github.com/llvm/llvm-project/pull/105669
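The two points in this comment can be illustrated with a toy pattern table. This is a sketch only: `ToyPattern`, `selectInstruction`, `toyPatternTable`, and the instruction name `TOY_ADD_INSN` are made up for illustration and do not correspond to the generated AMDGPU matcher or to any real instruction.

```cpp
#include <string>
#include <vector>

// Toy model only: not the matcher that TableGen generates.
enum class Opcode { ADD, PTRADD };

struct ToyPattern {
  Opcode Root;             // Node kind the pattern matches at its root.
  std::string Instruction; // Instruction the pattern selects.
};

// Returns the instruction of the first pattern whose root matches Node, or an
// empty string when no pattern applies.
std::string selectInstruction(const std::vector<ToyPattern> &Patterns,
                              Opcode Node) {
  for (const ToyPattern &P : Patterns)
    if (P.Root == Node)
      return P.Instruction;
  return "";
}

// Two consecutive, equivalent entries: one rooted at ADD and one at PTRADD.
std::vector<ToyPattern> toyPatternTable() {
  return {{Opcode::ADD, "TOY_ADD_INSN"}, {Opcode::PTRADD, "TOY_ADD_INSN"}};
}
```

If the backend never creates a PTRADD node, `selectInstruction` is only ever called with `Opcode::ADD`, so the second entry is dead but harmless; and were it ever reached, it would select the same instruction.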
[clang] [llvm] [AArch64][SelectionDAG] Add CodeGen support for scalar FEAT_CPA (PR #105669)
rgwott wrote: @arichardson Yes, of course. It's just that this is low priority at work right now. I intend to come back to it in a few days. https://github.com/llvm/llvm-project/pull/105669