[llvm-branch-commits] [llvm] [Support] Integrate SipHash.cpp into libSupport. (PR #94394)

2024-06-05 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited https://github.com/llvm/llvm-project/pull/94394
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [Support] Integrate SipHash.cpp into libSupport. (PR #94394)

2024-06-05 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls requested changes to this pull request.

Do I understand correctly that "remove big-endian support" results in this code 
not running correctly on big-endian machines?
I don't recall the LLVM project claiming that it cannot run on big-endian 
machines.
If I understand this correctly, I would not remove the big-endian support.
(I think it also helps with keeping the source code closer to the upstream 
siphash version, which might have some benefits?)

https://github.com/llvm/llvm-project/pull/94394
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [Support] Integrate SipHash.cpp into libSupport. (PR #94394)

2024-06-05 Thread Kristof Beyls via llvm-branch-commits


@@ -58,21 +38,15 @@
 v2 = ROTL(v2, 32); 
\
   } while (0)
 
-/*

kbeyls wrote:

It might be worthwhile to keep (an edited version of) this comment? It wasn't 
immediately obvious to me what the value is of removing it.

https://github.com/llvm/llvm-project/pull/94394
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [Support] Integrate SipHash.cpp into libSupport. (PR #94394)

2024-06-05 Thread Kristof Beyls via llvm-branch-commits


@@ -58,21 +38,15 @@
 v2 = ROTL(v2, 32); 
\
   } while (0)
 
-/*
-Computes a SipHash value
-*in: pointer to input data (read-only)
-inlen: input data length in bytes (any size_t value)
-*k: pointer to the key data (read-only), must be 16 bytes
-*out: pointer to output data (write-only), outlen bytes must be allocated
-outlen: length of the output in bytes, must be 8 or 16
-*/
-int siphash(const void *in, const size_t inlen, const void *k, uint8_t *out,
-const size_t outlen) {
+template 
+static inline ResultTy siphash(const unsigned char *in, uint64_t inlen,

kbeyls wrote:

Rather than using "static inline", wouldn't putting the `siphash` function in 
an anonymous namespace be the more C++-way of not making this function 
available to other translation units?
I'm also not fully sure if the `inline` keyword makes a meaningful difference?
(I'm not an expert on this, just wondering)

https://github.com/llvm/llvm-project/pull/94394
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [Support] Integrate SipHash.cpp into libSupport. (PR #94394)

2024-06-05 Thread Kristof Beyls via llvm-branch-commits


@@ -58,21 +38,15 @@
 v2 = ROTL(v2, 32); 
\
   } while (0)
 
-/*
-Computes a SipHash value
-*in: pointer to input data (read-only)
-inlen: input data length in bytes (any size_t value)
-*k: pointer to the key data (read-only), must be 16 bytes
-*out: pointer to output data (write-only), outlen bytes must be allocated
-outlen: length of the output in bytes, must be 8 or 16
-*/
-int siphash(const void *in, const size_t inlen, const void *k, uint8_t *out,
-const size_t outlen) {
+template 
+static inline ResultTy siphash(const unsigned char *in, uint64_t inlen,
+   const unsigned char (&k)[16]) {
 
   const unsigned char *ni = (const unsigned char *)in;
   const unsigned char *kk = (const unsigned char *)k;
 
-  assert((outlen == 8) || (outlen == 16));

kbeyls wrote:

I guess that adding
```
const size_t outlen = sizeof(ResultTy);
```
above this might result in fewer local modification needed in the rest of this 
function implementation?
Not sure if it's worth it. It might make it somewhat easier to update the 
implementation to a later "upstream" version if that would ever be needed. But 
I guess that's it's unlikely we'd ever need to update the implementation to a 
later upstream version

https://github.com/llvm/llvm-project/pull/94394
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [Support] Integrate SipHash.cpp into libSupport. (PR #94394)

2024-06-05 Thread Kristof Beyls via llvm-branch-commits


@@ -1,25 +1,19 @@
-/*
-   SipHash reference C implementation
-
-   Copyright (c) 2012-2022 Jean-Philippe Aumasson
-   
-   Copyright (c) 2012-2014 Daniel J. Bernstein 
-
-   To the extent possible under law, the author(s) have dedicated all copyright
-   and related and neighboring rights to this software to the public domain
-   worldwide. This software is distributed without any warranty.
-
-   You should have received a copy of the CC0 Public Domain Dedication along
-   with
-   this software. If not, see
-   .
- */
+//===--- SipHash.cpp - An ABI-stable string hash 
--===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
 
 #include "siphash.h"
 #include 
 #include 
 #include 
 
+// Lightly adapted from the SipHash reference C implementation by
+// Jean-Philippe Aumasson and Daniel J. Bernstein.

kbeyls wrote:

It might be worthwhile to also add the link to the upstream repo in this 
comment: https://github.com/veorq/SipHash?

https://github.com/llvm/llvm-project/pull/94394
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [Support] Integrate SipHash.cpp into libSupport. (PR #94394)

2024-06-05 Thread Kristof Beyls via llvm-branch-commits

kbeyls wrote:

> I'll also mention that I left the original variable naming but did re-format 
> the file, whitespace being more friendly to diffs, and this being nice and 
> tidily contained. If you or others have strong opinions, I'm happy to 
> recapitalize the names.

I don't have a strong opinion; I can see the argument both ways. Happy with it 
as is.

https://github.com/llvm/llvm-project/pull/94394
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [Support] Integrate SipHash.cpp into libSupport. (PR #94394)

2024-06-08 Thread Kristof Beyls via llvm-branch-commits

kbeyls wrote:

> So, regarding big-endian things. Original siphash is always "little-endian" 
> regardless of the host platform. On big endian hosts it essentially does byte 
> swap in the end. We do not have it here, so we will end with different hashes 
> on platforms with different endianness.
> 
> From the pauth perspective this is not a problem, as we do not do 
> cross-platform hash calculation and further comparison. The hash output 
> (discriminator value) is always compiled on compiler side and left as-is.
> 
> So, we can either keep the present code as-is. Or we can just sprinkle few 
> calls from `Endian.h` to do byteswap on BE platforms.

I was thinking that it is important that on both big and little endian 
platforms, the same hash is produced? Otherwise it becomes impossible to 
cross-compile from an other-endian host to an other-endian target? That 
basically would break ABI?
It would surface when combining libraries built on differently-endian 
platforms. Maybe this doesn't happen often in practice, but LLVM remains 
supported on big-endian platforms, so I would think it's important that those 
platforms can cross-compile correctly to other targets?

https://github.com/llvm/llvm-project/pull/94394
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [Support] Integrate SipHash.cpp into libSupport. (PR #94394)

2024-06-12 Thread Kristof Beyls via llvm-branch-commits

kbeyls wrote:

I just checked if there indeed are big-endian bots which should pick up if a 
different hash gets produced on a big-endian system. I guess this bot (the only 
bot?) would pick it up: https://lab.llvm.org/buildbot/#/builders/231

I now also realize that there are no tests with this commit. I assume that 
later commits that test hash computation for a pointer authentication 
discriminator will implicitly test this. In the ideal world, it seems there 
should be a simple test, e.g. checking one 64 bit and one 128 bit hash as part 
of this commit?
I don't think not having the test should block landing this PR, but if it would 
be straightforward to add a test, then I think it is still worthwhile to do it 
as part of this PR.

https://github.com/llvm/llvm-project/pull/94394
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [Support] Integrate SipHash.cpp into libSupport. (PR #94394)

2024-06-12 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls approved this pull request.


https://github.com/llvm/llvm-project/pull/94394
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [Support] Integrate SipHash.cpp into libSupport. (PR #94394)

2024-06-13 Thread Kristof Beyls via llvm-branch-commits

kbeyls wrote:

> Yes, this doesn't have tests by itself because there's no exposed interface. 
> It's certainly trivial to add one (which would allow using the reference test 
> vectors).
> 
> I don't have strong arguments either way, but I figured the conservative 
> option is to force hypothetical users to consider their use more seriously. 
> One might argue that's not how we usually treat libSupport, so I'm happy to 
> expose the raw function here.

I see some value in being able to test with the reference test vectors to be 
fully sure that the implementation really implements SipHash. But as I said 
above, I'm happy with merging this as is.

https://github.com/llvm/llvm-project/pull/94394
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [Support] Integrate SipHash.cpp into libSupport. (PR #94394)

2024-06-14 Thread Kristof Beyls via llvm-branch-commits

kbeyls wrote:

> [37c84b9](https://github.com/llvm/llvm-project/pull/94394/commits/37c84b9dce70f40db8a7c27b7de8232c4d10f78f)
>  shows what I had in mind, let me know what you all think. I added:
> 
> ```
> void getSipHash_2_4_64(ArrayRef In, const uint8_t (&K)[16],
>uint8_t (&Out)[8]);
> 
> void getSipHash_2_4_128(ArrayRef In, const uint8_t (&K)[16],
> uint8_t (&Out)[16]);
> ```
> 
> as the core interfaces, and mimicked the ref. test harness to reuse the same 
> test vectors. If this seems reasonable to yall I'm happy to extract the 
> vectors.h file from the ref. implementation into the "Import original 
> sources" PR – that's why I kept it open ;)

Thanks, that looks good to me.

https://github.com/llvm/llvm-project/pull/94394
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [Support] Integrate SipHash.cpp into libSupport. (PR #94394)

2024-06-14 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls approved this pull request.


https://github.com/llvm/llvm-project/pull/94394
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AArch64] Only create called thunks when hardening against SLS (PR #97472)

2024-07-03 Thread Kristof Beyls via llvm-branch-commits


@@ -36,38 +32,43 @@ using namespace llvm;
 
 #define AARCH64_SLS_HARDENING_NAME "AArch64 sls hardening pass"
 
+static const char SLSBLRNamePrefix[] = "__llvm_slsblr_thunk_";
+
 namespace {
 
-class AArch64SLSHardening : public MachineFunctionPass {
-public:
-  const TargetInstrInfo *TII;
-  const TargetRegisterInfo *TRI;
-  const AArch64Subtarget *ST;
+// Set of inserted thunks: bitmask with bits corresponding to
+// indexes in SLSBLRThunks array.
+typedef uint32_t ThunksSet;

kbeyls wrote:

I guess once support for BLRA* instruction will be added, the bitset may 
contain up to 29\*29-ish registers and this typedef will change to something 
different than a `uint32_t`?
I agree this can be done in the follow-on patch that will support BLRA* 
instructions.

https://github.com/llvm/llvm-project/pull/97472
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AArch64] Only create called thunks when hardening against SLS (PR #97472)

2024-07-03 Thread Kristof Beyls via llvm-branch-commits


@@ -46,13 +40,5 @@ body: |
 
 
 ...

-name:__llvm_slsblr_thunk_x8
-tracksRegLiveness: true
-body: |
-  bb.0.entry:
-liveins: $x8
 
-BR $x8

kbeyls wrote:

I think it is important to check that the body of the thunk inserted is as 
expected.
What is the motivation to remove that check, and only check that a thunk with 
the expected name was inserted, without checking its body?

https://github.com/llvm/llvm-project/pull/97472
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AArch64] Only create called thunks when hardening against SLS (PR #97472)

2024-07-03 Thread Kristof Beyls via llvm-branch-commits


@@ -36,38 +32,43 @@ using namespace llvm;
 
 #define AARCH64_SLS_HARDENING_NAME "AArch64 sls hardening pass"
 
+static const char SLSBLRNamePrefix[] = "__llvm_slsblr_thunk_";
+
 namespace {
 
-class AArch64SLSHardening : public MachineFunctionPass {
-public:
-  const TargetInstrInfo *TII;
-  const TargetRegisterInfo *TRI;
-  const AArch64Subtarget *ST;
+// Set of inserted thunks: bitmask with bits corresponding to
+// indexes in SLSBLRThunks array.
+typedef uint32_t ThunksSet;
 
-  static char ID;
-
-  AArch64SLSHardening() : MachineFunctionPass(ID) {
-initializeAArch64SLSHardeningPass(*PassRegistry::getPassRegistry());
+struct SLSBLRThunkInserter : ThunkInserter {

kbeyls wrote:

I'm not sure if `SLSBLRThunkInserter` is the most appropriate name for this 
class.
IIUC, this class will also harden Returns and BRs, (see method 
`hardenReturnsAndBRs`), which does not insert thunks.
If I understand this correctly, I think sticking with the original name of 
"AArch64SLSHardening" describes better what this class does?

https://github.com/llvm/llvm-project/pull/97472
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AArch64][PAC] Support BLRA* instructions in SLS Hardening pass (PR #97605)

2024-07-05 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls commented:

Thank you, this mostly looks good to me.
I've only added very minor comments; feel free to disagree with them.

https://github.com/llvm/llvm-project/pull/97605
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AArch64][PAC] Support BLRA* instructions in SLS Hardening pass (PR #97605)

2024-07-05 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited https://github.com/llvm/llvm-project/pull/97605
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AArch64][PAC] Support BLRA* instructions in SLS Hardening pass (PR #97605)

2024-07-05 Thread Kristof Beyls via llvm-branch-commits


@@ -68,6 +156,57 @@ struct SLSHardeningInserter : 
ThunkInserter {
 
 } // end anonymous namespace
 
+const ThunkKind ThunkKind::BR = {ThunkBR, "", false, false, AArch64::BR};
+const ThunkKind ThunkKind::BRAA = {ThunkBRAA, "aa_", true, true, 
AArch64::BRAA};
+const ThunkKind ThunkKind::BRAB = {ThunkBRAB, "ab_", true, true, 
AArch64::BRAB};
+const ThunkKind ThunkKind::BRAAZ = {ThunkBRAAZ, "aaz_", false, true,
+AArch64::BRAAZ};
+const ThunkKind ThunkKind::BRABZ = {ThunkBRABZ, "abz_", false, true,
+AArch64::BRABZ};

kbeyls wrote:

very minor nitpick: These probably can go directly after the definition of 
`struct ThunkKind`?
If so, it's easier to understand what all the "false" and "true"s mean here 
without having to scroll a lot to where `struct ThunkKind` is defined.

https://github.com/llvm/llvm-project/pull/97605
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AArch64][PAC] Support BLRA* instructions in SLS Hardening pass (PR #97605)

2024-07-05 Thread Kristof Beyls via llvm-branch-commits


@@ -0,0 +1,210 @@
+# RUN: llc -verify-machineinstrs -mtriple=aarch64-none-linux-gnu \
+# RUN: -start-before aarch64-sls-hardening -o - %s \
+# RUN: -asm-verbose=0 \
+# RUN: | FileCheck %s \
+# RUN: --implicit-check-not=__llvm_slsblr_thunk_aa_x5_x8 \
+# RUN: --implicit-check-not=__llvm_slsblr_thunk_ab_x5_x8 \
+# RUN: --implicit-check-not=__llvm_slsblr_thunk_aaz_x5 \
+# RUN: --implicit-check-not=__llvm_slsblr_thunk_abz_x5
+
+# Pointer Authentication extension introduces more branch-with-link 
instructions

kbeyls wrote:

Maybe say "branch-with-link-to-register" as that's the name the ArmARM (Arm 
Architecture reference manual) uses for BLR* instructions. "branch-with-link" 
to me refers to the BL instruction.

https://github.com/llvm/llvm-project/pull/97605
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AArch64][PAC] Support BLRA* instructions in SLS Hardening pass (PR #97605)

2024-07-05 Thread Kristof Beyls via llvm-branch-commits


@@ -221,13 +339,19 @@ void SLSHardeningInserter::populateThunk(MachineFunction 
&MF) {
   //  __llvm_slsblr_thunk_xN:
   //  BR xN
   //  barrierInsts

kbeyls wrote:

I think it would be useful to update this comment to make it clear what the 
different kinds of thunk bodies are that are intended to be created here.

https://github.com/llvm/llvm-project/pull/97605
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AArch64][PAC] Support BLRA* instructions in SLS Hardening pass (PR #97605)

2024-07-05 Thread Kristof Beyls via llvm-branch-commits


@@ -221,13 +339,19 @@ void SLSHardeningInserter::populateThunk(MachineFunction 
&MF) {
   //  __llvm_slsblr_thunk_xN:
   //  BR xN
   //  barrierInsts

kbeyls wrote:

As part of reviewing this, I was wondering why the actual thunk content is
```
  //  __llvm_slsblr_thunk_{aa|ab|aaz|abz|}_xN_{xM}:
  //  MOV X16, Xn
  //  BR X16 | BRA{A|B} X16, Xm | BRA{A|B}Z X16
  //  barrierInsts
```

I had to use git blame to remind myself of why I changed this about 4 years ago,
pointing to this commit: 
https://github.com/llvm/llvm-project/commit/d938ec4509c47d461377527fc2877ae14b91275c

I think it would be useful to add an explanation similar to the one on that 
commit message
to the comment here to explain why the `mov X16, Xn` is needed, as
it is non-trivial. The explanation on the original commit message is:
```
A "BTI c" instruction only allows jumping/calling to using a BLR* instruction.
However, the SLSBLR mitigation changes a BLR to a BR to implement the
function call. Therefore, a "BTI c" check that passed before could
trigger after the BLR->BR change done by the SLSBLR mitigation.
However, if the register used in BR is X16 or X17, this trigger will not
fire (see ArmARM for further details).
```


https://github.com/llvm/llvm-project/pull/97605
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AArch64][PAC] Support BLRA* instructions in SLS Hardening pass (PR #97605)

2024-07-05 Thread Kristof Beyls via llvm-branch-commits


@@ -221,13 +339,19 @@ void SLSHardeningInserter::populateThunk(MachineFunction 
&MF) {
   //  __llvm_slsblr_thunk_xN:
   //  BR xN
   //  barrierInsts

kbeyls wrote:

Maybe something like
```
  //  __llvm_slsblr_thunk_{aa|ab|aaz|abz|}_xN_{xM}:
  //  BR Xn | BRA{A|B} Xn, Xm | BRA{A|B}Z xN
  //  barrierInsts
```
?

https://github.com/llvm/llvm-project/pull/97605
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AArch64][PAC] Support BLRA* instructions in SLS Hardening pass (PR #97605)

2024-07-05 Thread Kristof Beyls via llvm-branch-commits


@@ -68,6 +156,57 @@ struct SLSHardeningInserter : 
ThunkInserter {
 
 } // end anonymous namespace
 
+const ThunkKind ThunkKind::BR = {ThunkBR, "", false, false, AArch64::BR};
+const ThunkKind ThunkKind::BRAA = {ThunkBRAA, "aa_", true, true, 
AArch64::BRAA};
+const ThunkKind ThunkKind::BRAB = {ThunkBRAB, "ab_", true, true, 
AArch64::BRAB};
+const ThunkKind ThunkKind::BRAAZ = {ThunkBRAAZ, "aaz_", false, true,
+AArch64::BRAAZ};
+const ThunkKind ThunkKind::BRABZ = {ThunkBRABZ, "abz_", false, true,
+AArch64::BRABZ};
+
+static const ThunkKind *getThunkKind(unsigned OriginalOpcode) {

kbeyls wrote:

very minor comment: I thought I'd just share that if I would've written this 
function, I'd probably use std::optional rather than a "pointer to" return 
value. But this is very nit-picky, so if you think "pointer to" is better, 
let's leave it as is.

https://github.com/llvm/llvm-project/pull/97605
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AArch64][PAC] Support BLRA* instructions in SLS Hardening pass (PR #97605)

2024-07-05 Thread Kristof Beyls via llvm-branch-commits


@@ -274,40 +398,31 @@ void SLSHardeningInserter::convertBLRToBL(
 
   MachineInstr &BLR = *MBBI;
   assert(isBLR(BLR));
-  unsigned BLOpcode;
-  Register Reg;
-  bool RegIsKilled;
-  switch (BLR.getOpcode()) {
-  case AArch64::BLR:
-  case AArch64::BLRNoIP:
-BLOpcode = AArch64::BL;
-Reg = BLR.getOperand(0).getReg();
-assert(Reg != AArch64::X16 && Reg != AArch64::X17 && Reg != AArch64::LR);
-RegIsKilled = BLR.getOperand(0).isKill();
-break;
-  case AArch64::BLRAA:
-  case AArch64::BLRAB:
-  case AArch64::BLRAAZ:
-  case AArch64::BLRABZ:
-llvm_unreachable("BLRA instructions cannot yet be produced by LLVM, "
- "therefore there is no need to support them for now.");
-  default:
-llvm_unreachable("unhandled BLR");
-  }
+  const ThunkKind &Kind = *getThunkKind(BLR.getOpcode());

kbeyls wrote:

In the 20-line-ish long comment above (which I cannot comment on directly in 
github),
I think it would also be useful to indicate which branch instructions might be 
the authenticating ones. I think that might help the reader a bit better with 
more quickly understanding what the intended transform is. E.g. something like:
```
// Before:
  //   |-|
  //   |  ...|
  //   |  instI  |
  //   |  BLR{A|B|}{Z} xN{, xM}  |
  //   |  instJ  |
  //   |  ...|
  //   |-|
  //
  // After:
  //   |-|
  //   |  ...|
  //   |  instI  |
  //   |  BL __llvm_slsblr_thunk_xN  |
  //   |  instJ  |
  //   |  ...|
  //   |-|
  //
  //   __llvm_slsblr_thunk_xN:
  //   |-|
  //   |  BR{A|B|}{Z} xN{, xM}   |
  //   |  barrierInsts   |
  //   |-|
//
```

https://github.com/llvm/llvm-project/pull/97605
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AArch64][PAC] Support BLRA* instructions in SLS Hardening pass (PR #97605)

2024-07-05 Thread Kristof Beyls via llvm-branch-commits


@@ -139,64 +262,59 @@ bool 
SLSHardeningInserter::hardenReturnsAndBRs(MachineModuleInfo &MMI,
   return Modified;
 }
 
-static const unsigned NumPermittedRegs = 29;
-static const struct ThunkNameAndReg {
-  const char* Name;
-  Register Reg;
-} SLSBLRThunks[NumPermittedRegs] = {
-{"__llvm_slsblr_thunk_x0", AArch64::X0},
-{"__llvm_slsblr_thunk_x1", AArch64::X1},
-{"__llvm_slsblr_thunk_x2", AArch64::X2},
-{"__llvm_slsblr_thunk_x3", AArch64::X3},
-{"__llvm_slsblr_thunk_x4", AArch64::X4},
-{"__llvm_slsblr_thunk_x5", AArch64::X5},
-{"__llvm_slsblr_thunk_x6", AArch64::X6},
-{"__llvm_slsblr_thunk_x7", AArch64::X7},
-{"__llvm_slsblr_thunk_x8", AArch64::X8},
-{"__llvm_slsblr_thunk_x9", AArch64::X9},
-{"__llvm_slsblr_thunk_x10", AArch64::X10},
-{"__llvm_slsblr_thunk_x11", AArch64::X11},
-{"__llvm_slsblr_thunk_x12", AArch64::X12},
-{"__llvm_slsblr_thunk_x13", AArch64::X13},
-{"__llvm_slsblr_thunk_x14", AArch64::X14},
-{"__llvm_slsblr_thunk_x15", AArch64::X15},
-// X16 and X17 are deliberately missing, as the mitigation requires those
-// register to not be used in BLR. See comment in ConvertBLRToBL for more
-// details.
-{"__llvm_slsblr_thunk_x18", AArch64::X18},
-{"__llvm_slsblr_thunk_x19", AArch64::X19},
-{"__llvm_slsblr_thunk_x20", AArch64::X20},
-{"__llvm_slsblr_thunk_x21", AArch64::X21},
-{"__llvm_slsblr_thunk_x22", AArch64::X22},
-{"__llvm_slsblr_thunk_x23", AArch64::X23},
-{"__llvm_slsblr_thunk_x24", AArch64::X24},
-{"__llvm_slsblr_thunk_x25", AArch64::X25},
-{"__llvm_slsblr_thunk_x26", AArch64::X26},
-{"__llvm_slsblr_thunk_x27", AArch64::X27},
-{"__llvm_slsblr_thunk_x28", AArch64::X28},
-{"__llvm_slsblr_thunk_x29", AArch64::FP},
-// X30 is deliberately missing, for similar reasons as X16 and X17 are
-// missing.
-{"__llvm_slsblr_thunk_x31", AArch64::XZR},
-};
+// Currently, the longest possible thunk name is
+//   __llvm_slsblr_thunk_aa_xNN_xMM
+// which is 31 characters (without the '\0' character).
+static SmallString<32> createThunkName(const ThunkKind &Kind, Register Xn,
+   Register Xm) {
+  unsigned N = ThunksSet::indexOfXReg(Xn);
+  if (!Kind.HasXmOperand)
+return formatv("{0}{1}x{2}", CommonNamePrefix, Kind.NameInfix, N);
+
+  unsigned M = ThunksSet::indexOfXReg(Xm);
+  return formatv("{0}{1}x{2}_x{3}", CommonNamePrefix, Kind.NameInfix, N, M);
+}
 
-unsigned getThunkIndex(Register Reg) {
-  for (unsigned I = 0; I < NumPermittedRegs; ++I)
-if (SLSBLRThunks[I].Reg == Reg)
-  return I;
-  llvm_unreachable("Unexpected register");
+static const ThunkKind &parseThunkName(StringRef ThunkName, Register &Xn,

kbeyls wrote:

Again a very minor nitpick: I was a bit confused initially that the 3 results 
of this function are returned split between function return value (ThunkKind) 
and 2 pass-by-reference function arguments (Xn, Xm).
Would it be better to either return all 3 of them through a function return 
value, or return all 3 of them through pass-by-reference function arguments?
I'm just sharing it in case you think it's worthwhile to make such a change.

https://github.com/llvm/llvm-project/pull/97605
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AArch64][PAC] Support BLRA* instructions in SLS Hardening pass (PR #97605)

2024-07-05 Thread Kristof Beyls via llvm-branch-commits


@@ -68,6 +156,57 @@ struct SLSHardeningInserter : 
ThunkInserter {
 
 } // end anonymous namespace
 
+const ThunkKind ThunkKind::BR = {ThunkBR, "", false, false, AArch64::BR};
+const ThunkKind ThunkKind::BRAA = {ThunkBRAA, "aa_", true, true, 
AArch64::BRAA};
+const ThunkKind ThunkKind::BRAB = {ThunkBRAB, "ab_", true, true, 
AArch64::BRAB};
+const ThunkKind ThunkKind::BRAAZ = {ThunkBRAAZ, "aaz_", false, true,
+AArch64::BRAAZ};
+const ThunkKind ThunkKind::BRABZ = {ThunkBRABZ, "abz_", false, true,
+AArch64::BRABZ};
+
+static const ThunkKind *getThunkKind(unsigned OriginalOpcode) {
+  switch (OriginalOpcode) {
+  case AArch64::BLR:
+  case AArch64::BLRNoIP:
+return &ThunkKind::BR;
+  case AArch64::BLRAA:
+return &ThunkKind::BRAA;
+  case AArch64::BLRAB:
+return &ThunkKind::BRAB;
+  case AArch64::BLRAAZ:
+return &ThunkKind::BRAAZ;
+  case AArch64::BLRABZ:
+return &ThunkKind::BRABZ;
+  }
+  return nullptr;
+}
+
+static bool isBLR(const MachineInstr &MI) {
+  return getThunkKind(MI.getOpcode()) != nullptr;
+}
+
+unsigned ThunksSet::indexOfXReg(Register Reg) {
+  assert(AArch64::GPR64RegClass.contains(Reg));
+  assert(Reg != AArch64::X16 && Reg != AArch64::X17 && Reg != AArch64::LR);
+
+  // Most Xn registers have consequent ids, except for FP and XZR.

kbeyls wrote:

s/consequent/consecutive/?

https://github.com/llvm/llvm-project/pull/97605
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] c816be2 - Add release note for aarch64-none-elf driver change.

2022-01-26 Thread Kristof Beyls via llvm-branch-commits

Author: Kristof Beyls
Date: 2022-01-26T09:13:22+01:00
New Revision: c816be2026af3641f9b648482c48dd1f18a73dd1

URL: 
https://github.com/llvm/llvm-project/commit/c816be2026af3641f9b648482c48dd1f18a73dd1
DIFF: 
https://github.com/llvm/llvm-project/commit/c816be2026af3641f9b648482c48dd1f18a73dd1.diff

LOG: Add release note for aarch64-none-elf driver change.

Added: 


Modified: 
clang/docs/ReleaseNotes.rst

Removed: 




diff  --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 2272d7197ac57..6a9b046a1427d 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -275,6 +275,9 @@ Arm and AArch64 Support in Clang
   architecture features, but will enable certain optimizations specific to
   Cortex-A57 CPUs and enable the use of a more accurate scheduling model.
 
+- The --aarch64-none-elf target now uses the BareMetal driver rather than the
+  GNU driver. Programs that depend on clang invoking GCC as the linker driver
+  should use GCC as the linker in the build system.
 
 Floating Point Support in Clang
 ---



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 195f442 - [ARM] Implement harden-sls-retbr for ARM mode

2020-12-19 Thread Kristof Beyls via llvm-branch-commits

Author: Kristof Beyls
Date: 2020-12-19T11:42:39Z
New Revision: 195f44278c4361a4a32377a98a1e3a15203d3647

URL: 
https://github.com/llvm/llvm-project/commit/195f44278c4361a4a32377a98a1e3a15203d3647
DIFF: 
https://github.com/llvm/llvm-project/commit/195f44278c4361a4a32377a98a1e3a15203d3647.diff

LOG: [ARM] Implement harden-sls-retbr for ARM mode

Some processors may speculatively execute the instructions immediately
following indirect control flow, such as returns, indirect jumps and
indirect function calls.

To avoid a potential miss-speculatively executed gadget after these
instructions leaking secrets through side channels, this pass places a
speculation barrier immediately after every indirect control flow where
control flow doesn't return to the next instruction, such as returns and
indirect jumps, but not indirect function calls.

Hardening of indirect function calls will be done in a later,
independent patch.

This patch is implementing the same functionality as the AArch64 counter
part implemented in https://reviews.llvm.org/D81400.
For AArch64, returns and indirect jumps only occur on RET and BR
instructions and hence the function attribute to control the hardening
is called "harden-sls-retbr" there. On AArch32, there is a much wider
variety of instructions that can trigger an indirect unconditional
control flow change.  I've decided to stick with the name
"harden-sls-retbr" as introduced for the corresponding AArch64
mitigation.

This patch implements this for ARM mode. A future patch will extend this
to also support Thumb mode.

The inserted barriers are never on the correct, architectural execution
path, and therefore performance overhead of this is expected to be low.
To ensure these barriers are never on an architecturally executed path,
when the harden-sls-retbr function attribute is present, indirect
control flow is never conditionalized/predicated.

On targets that implement that Armv8.0-SB Speculation Barrier extension,
a single SB instruction is emitted that acts as a speculation barrier.
On other targets, a DSB SYS followed by a ISB is emitted to act as a
speculation barrier.

These speculation barriers are implemented as pseudo instructions to
avoid later passes to analyze them and potentially remove them.

The mitigation is off by default and can be enabled by the
harden-sls-retbr subtarget feature.

Differential Revision: https://reviews.llvm.org/D92395

Added: 
llvm/lib/Target/ARM/ARMSLSHardening.cpp
llvm/test/CodeGen/ARM/speculation-hardening-sls.ll

Modified: 
llvm/lib/Target/ARM/ARM.h
llvm/lib/Target/ARM/ARM.td
llvm/lib/Target/ARM/ARMAsmPrinter.cpp
llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
llvm/lib/Target/ARM/ARMBaseInstrInfo.h
llvm/lib/Target/ARM/ARMConstantIslandPass.cpp
llvm/lib/Target/ARM/ARMInstrInfo.td
llvm/lib/Target/ARM/ARMSubtarget.h
llvm/lib/Target/ARM/ARMTargetMachine.cpp
llvm/lib/Target/ARM/CMakeLists.txt
llvm/test/CodeGen/ARM/O3-pipeline.ll

Removed: 




diff  --git a/llvm/lib/Target/ARM/ARM.h b/llvm/lib/Target/ARM/ARM.h
index 7398968bb24a..51dfaaa96892 100644
--- a/llvm/lib/Target/ARM/ARM.h
+++ b/llvm/lib/Target/ARM/ARM.h
@@ -55,6 +55,7 @@ InstructionSelector *
 createARMInstructionSelector(const ARMBaseTargetMachine &TM, const 
ARMSubtarget &STI,
  const ARMRegisterBankInfo &RBI);
 Pass *createMVEGatherScatterLoweringPass();
+FunctionPass *createARMSLSHardeningPass();
 
 void LowerARMMachineInstrToMCInst(const MachineInstr *MI, MCInst &OutMI,
   ARMAsmPrinter &AP);
@@ -71,6 +72,7 @@ void initializeMVEVPTOptimisationsPass(PassRegistry &);
 void initializeARMLowOverheadLoopsPass(PassRegistry &);
 void initializeMVETailPredicationPass(PassRegistry &);
 void initializeMVEGatherScatterLoweringPass(PassRegistry &);
+void initializeARMSLSHardeningPass(PassRegistry &);
 
 } // end namespace llvm
 

diff  --git a/llvm/lib/Target/ARM/ARM.td b/llvm/lib/Target/ARM/ARM.td
index 3fa65289744e..4d4ace51e13f 100644
--- a/llvm/lib/Target/ARM/ARM.td
+++ b/llvm/lib/Target/ARM/ARM.td
@@ -562,6 +562,16 @@ foreach i = {0-7} in
   "Coprocessor "#i#" ISA is CDEv1",
   [HasCDEOps]>;
 
+//===--===//
+// Control codegen mitigation against Straight Line Speculation vulnerability.
+//===--===//
+
+def FeatureHardenSlsRetBr : SubtargetFeature<"harden-sls-retbr",
+  "HardenSlsRetBr", "true",
+  "Harden against straight line speculation across RETurn and BranchRegister "
+  "instructions">;
+
+
 
//===--===//
 // ARM Processor subtarget features.
 //

diff  --git a/llvm/lib/Target/ARM/ARMAsmPrinter.cpp 
b/llvm/lib/Target

[llvm-branch-commits] [llvm] 320fd33 - [ARM] Implement harden-sls-retbr for Thumb mode

2020-12-19 Thread Kristof Beyls via llvm-branch-commits

Author: Kristof Beyls
Date: 2020-12-19T12:32:47Z
New Revision: 320fd3314e378ae6242a2dde97250a8a94d68e27

URL: 
https://github.com/llvm/llvm-project/commit/320fd3314e378ae6242a2dde97250a8a94d68e27
DIFF: 
https://github.com/llvm/llvm-project/commit/320fd3314e378ae6242a2dde97250a8a94d68e27.diff

LOG: [ARM] Implement harden-sls-retbr for Thumb mode

The only non-trivial consideration in this patch is that the formation
of TBB/TBH instructions, which is done in the constant island pass, does
not understand the speculation barriers inserted by the SLSHardening
pass. As such, when harden-sls-retbr is enabled for a function, the
formation of TBB/TBH instructions in the constant island pass is
disabled.

Differential Revision: https://reviews.llvm.org/D92396

Added: 


Modified: 
llvm/lib/Target/ARM/ARMAsmPrinter.cpp
llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
llvm/lib/Target/ARM/ARMBaseInstrInfo.h
llvm/lib/Target/ARM/ARMConstantIslandPass.cpp
llvm/lib/Target/ARM/ARMInstrThumb2.td
llvm/lib/Target/ARM/ARMSLSHardening.cpp
llvm/test/CodeGen/ARM/speculation-hardening-sls.ll

Removed: 




diff  --git a/llvm/lib/Target/ARM/ARMAsmPrinter.cpp 
b/llvm/lib/Target/ARM/ARMAsmPrinter.cpp
index 4cc85ad82d51..dd22e5dfe6e1 100644
--- a/llvm/lib/Target/ARM/ARMAsmPrinter.cpp
+++ b/llvm/lib/Target/ARM/ARMAsmPrinter.cpp
@@ -2192,6 +2192,22 @@ void ARMAsmPrinter::emitInstruction(const MachineInstr 
*MI) {
 EmitToStreamer(*OutStreamer, TmpInstISB);
 return;
   }
+  case ARM::t2SpeculationBarrierISBDSBEndBB: {
+// Print DSB SYS + ISB
+MCInst TmpInstDSB;
+TmpInstDSB.setOpcode(ARM::t2DSB);
+TmpInstDSB.addOperand(MCOperand::createImm(0xf));
+TmpInstDSB.addOperand(MCOperand::createImm(ARMCC::AL));
+TmpInstDSB.addOperand(MCOperand::createReg(0));
+EmitToStreamer(*OutStreamer, TmpInstDSB);
+MCInst TmpInstISB;
+TmpInstISB.setOpcode(ARM::t2ISB);
+TmpInstISB.addOperand(MCOperand::createImm(0xf));
+TmpInstISB.addOperand(MCOperand::createImm(ARMCC::AL));
+TmpInstISB.addOperand(MCOperand::createReg(0));
+EmitToStreamer(*OutStreamer, TmpInstISB);
+return;
+  }
   case ARM::SpeculationBarrierSBEndBB: {
 // Print SB
 MCInst TmpInstSB;
@@ -2199,6 +2215,13 @@ void ARMAsmPrinter::emitInstruction(const MachineInstr 
*MI) {
 EmitToStreamer(*OutStreamer, TmpInstSB);
 return;
   }
+  case ARM::t2SpeculationBarrierSBEndBB: {
+// Print SB
+MCInst TmpInstSB;
+TmpInstSB.setOpcode(ARM::t2SB);
+EmitToStreamer(*OutStreamer, TmpInstSB);
+return;
+  }
   }
 
   MCInst TmpInst;

diff  --git a/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp 
b/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
index 7068da5eb004..1435bba776a3 100644
--- a/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
+++ b/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
@@ -775,9 +775,11 @@ unsigned ARMBaseInstrInfo::getInstSizeInBytes(const 
MachineInstr &MI) const {
 return Size;
   }
   case ARM::SpeculationBarrierISBDSBEndBB:
+  case ARM::t2SpeculationBarrierISBDSBEndBB:
 // This gets lowered to 2 4-byte instructions.
 return 8;
   case ARM::SpeculationBarrierSBEndBB:
+  case ARM::t2SpeculationBarrierSBEndBB:
 // This gets lowered to 1 4-byte instructions.
 return 4;
   }

diff  --git a/llvm/lib/Target/ARM/ARMBaseInstrInfo.h 
b/llvm/lib/Target/ARM/ARMBaseInstrInfo.h
index 51a4b44eae1d..e4e71e4925b9 100644
--- a/llvm/lib/Target/ARM/ARMBaseInstrInfo.h
+++ b/llvm/lib/Target/ARM/ARMBaseInstrInfo.h
@@ -643,7 +643,9 @@ static inline bool isIndirectControlFlowNotComingBack(const 
MachineInstr &MI) {
 
 static inline bool isSpeculationBarrierEndBBOpcode(int Opc) {
   return Opc == ARM::SpeculationBarrierISBDSBEndBB ||
- Opc == ARM::SpeculationBarrierSBEndBB;
+ Opc == ARM::SpeculationBarrierSBEndBB ||
+ Opc == ARM::t2SpeculationBarrierISBDSBEndBB ||
+ Opc == ARM::t2SpeculationBarrierSBEndBB;
 }
 
 static inline bool isPopOpcode(int Opc) {

diff  --git a/llvm/lib/Target/ARM/ARMConstantIslandPass.cpp 
b/llvm/lib/Target/ARM/ARMConstantIslandPass.cpp
index 77839710e03e..da7bf6170255 100644
--- a/llvm/lib/Target/ARM/ARMConstantIslandPass.cpp
+++ b/llvm/lib/Target/ARM/ARMConstantIslandPass.cpp
@@ -359,6 +359,10 @@ bool 
ARMConstantIslands::runOnMachineFunction(MachineFunction &mf) {
   isThumb2 = AFI->isThumb2Function();
 
   bool GenerateTBB = isThumb2 || (isThumb1 && SynthesizeThumb1TBB);
+  // TBB generation code in this constant island pass has not been adapted to
+  // deal with speculation barriers.
+  if (STI->hardenSlsRetBr())
+GenerateTBB = false;
 
   // Renumber all of the machine basic blocks in the function, guaranteeing 
that
   // the numbers agree with the position of the block in the function.

diff  --git a/llvm/lib/Target/ARM/ARMInstrThumb2.td 
b/llvm/lib/Target/ARM/ARMInstrThumb2.td
index 52da88dab632..f83807d27f88 100644
--- a/llvm/lib/Target/ARM/ARMInstrT

[llvm-branch-commits] [llvm] a4c1f51 - [ARM] Harden indirect calls against SLS

2020-12-19 Thread Kristof Beyls via llvm-branch-commits

Author: Kristof Beyls
Date: 2020-12-19T12:33:42Z
New Revision: a4c1f5160e6d1de9a9959ecbf329c2acf4f3ed31

URL: 
https://github.com/llvm/llvm-project/commit/a4c1f5160e6d1de9a9959ecbf329c2acf4f3ed31
DIFF: 
https://github.com/llvm/llvm-project/commit/a4c1f5160e6d1de9a9959ecbf329c2acf4f3ed31.diff

LOG: [ARM] Harden indirect calls against SLS

To make sure that no barrier gets placed on the architectural execution
path, each indirect call calling the function in register rN, it gets
transformed to a direct call to __llvm_slsblr_thunk_mode_rN.  mode is
either arm or thumb, depending on the mode of where the indirect call
happens.

The llvm_slsblr_thunk_mode_rN thunk contains:

bx rN


Therefore, the indirect call gets split into 2; one direct call and one
indirect jump.
This transformation results in not inserting a speculation barrier on
the architectural execution path.

The mitigation is off by default and can be enabled by the
harden-sls-blr subtarget feature.

As a linker is allowed to clobber r12 on function calls, the
above code transformation is not correct in case a linker does so.
Similarly, the transformation is not correct when register lr is used.
Avoiding r12/lr being used is done in a follow-on patch to make
reviewing this code easier.

Differential Revision: https://reviews.llvm.org/D92468

Added: 


Modified: 
llvm/lib/Target/ARM/ARM.h
llvm/lib/Target/ARM/ARM.td
llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
llvm/lib/Target/ARM/ARMBaseInstrInfo.h
llvm/lib/Target/ARM/ARMSLSHardening.cpp
llvm/lib/Target/ARM/ARMSubtarget.h
llvm/lib/Target/ARM/ARMTargetMachine.cpp
llvm/test/CodeGen/ARM/O3-pipeline.ll
llvm/test/CodeGen/ARM/speculation-hardening-sls.ll

Removed: 




diff  --git a/llvm/lib/Target/ARM/ARM.h b/llvm/lib/Target/ARM/ARM.h
index 51dfaaa96892..d8a4e4c31012 100644
--- a/llvm/lib/Target/ARM/ARM.h
+++ b/llvm/lib/Target/ARM/ARM.h
@@ -56,6 +56,7 @@ createARMInstructionSelector(const ARMBaseTargetMachine &TM, 
const ARMSubtarget
  const ARMRegisterBankInfo &RBI);
 Pass *createMVEGatherScatterLoweringPass();
 FunctionPass *createARMSLSHardeningPass();
+FunctionPass *createARMIndirectThunks();
 
 void LowerARMMachineInstrToMCInst(const MachineInstr *MI, MCInst &OutMI,
   ARMAsmPrinter &AP);

diff  --git a/llvm/lib/Target/ARM/ARM.td b/llvm/lib/Target/ARM/ARM.td
index 4d4ace51e13f..8111346c74f6 100644
--- a/llvm/lib/Target/ARM/ARM.td
+++ b/llvm/lib/Target/ARM/ARM.td
@@ -570,6 +570,10 @@ def FeatureHardenSlsRetBr : 
SubtargetFeature<"harden-sls-retbr",
   "HardenSlsRetBr", "true",
   "Harden against straight line speculation across RETurn and BranchRegister "
   "instructions">;
+def FeatureHardenSlsBlr : SubtargetFeature<"harden-sls-blr",
+  "HardenSlsBlr", "true",
+  "Harden against straight line speculation across indirect calls">;
+
 
 
 
//===--===//

diff  --git a/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp 
b/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
index 1435bba776a3..9a71b9264fcd 100644
--- a/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
+++ b/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
@@ -691,6 +691,8 @@ bool ARMBaseInstrInfo::isPredicable(const MachineInstr &MI) 
const {
   const ARMSubtarget &ST = MF->getSubtarget();
   if (ST.hardenSlsRetBr() && isIndirectControlFlowNotComingBack(MI))
 return false;
+  if (ST.hardenSlsBlr() && isIndirectCall(MI))
+return false;
 
   if (AFI->isThumb2Function()) {
 if (getSubtarget().restrictIT())

diff  --git a/llvm/lib/Target/ARM/ARMBaseInstrInfo.h 
b/llvm/lib/Target/ARM/ARMBaseInstrInfo.h
index e4e71e4925b9..47a2cf44f3a9 100644
--- a/llvm/lib/Target/ARM/ARMBaseInstrInfo.h
+++ b/llvm/lib/Target/ARM/ARMBaseInstrInfo.h
@@ -635,6 +635,51 @@ bool isIndirectBranchOpcode(int Opc) {
   return Opc == ARM::BX || Opc == ARM::MOVPCRX || Opc == ARM::tBRIND;
 }
 
+static inline bool isIndirectCall(const MachineInstr &MI) {
+  int Opc = MI.getOpcode();
+  switch (Opc) {
+// indirect calls:
+  case ARM::BLX:
+  case ARM::BLX_pred:
+  case ARM::BX_CALL:
+  case ARM::BMOVPCRX_CALL:
+  case ARM::TCRETURNri:
+  case ARM::TAILJMPr:
+  case ARM::TAILJMPr4:
+  case ARM::tBLXr:
+  case ARM::tBLXNSr:
+  case ARM::tBLXNS_CALL:
+  case ARM::tBX_CALL:
+  case ARM::tTAILJMPr:
+assert(MI.isCall(MachineInstr::IgnoreBundle));
+return true;
+// direct calls:
+  case ARM::BL:
+  case ARM::BL_pred:
+  case ARM::BMOVPCB_CALL:
+  case ARM::BL_PUSHLR:
+  case ARM::BLXi:
+  case ARM::TCRETURNdi:
+  case ARM::TAILJMPd:
+  case ARM::SVC:
+  case ARM::HVC:
+  case ARM::TPsoft:
+  case ARM::tTAILJMPd:
+  case ARM::t2SMC:
+  case ARM::t2HVC:
+  case ARM::tBL:
+  case ARM::tBLXi:
+  case ARM::tBL_PUSHLR:
+  case ARM::tTAILJMPdND:
+  case ARM::tSVC:
+  case ARM::tTPsoft:
+assert(MI.isCall(MachineInstr::IgnoreBundle));
+return false;

[llvm-branch-commits] [llvm] df8ed39 - [ARM] harden-sls-blr: avoid r12 and lr in indirect calls.

2020-12-19 Thread Kristof Beyls via llvm-branch-commits

Author: Kristof Beyls
Date: 2020-12-19T12:39:59Z
New Revision: df8ed3928377edc6e9241a56680b694ffa9f4d6d

URL: 
https://github.com/llvm/llvm-project/commit/df8ed3928377edc6e9241a56680b694ffa9f4d6d
DIFF: 
https://github.com/llvm/llvm-project/commit/df8ed3928377edc6e9241a56680b694ffa9f4d6d.diff

LOG: [ARM] harden-sls-blr: avoid r12 and lr in indirect calls.

As a linker is allowed to clobber r12 on function calls, the code
transformation that hardens indirect calls is not correct in case a
linker does so.  Similarly, the transformation is not correct when
register lr is used.

This patch makes sure that r12 or lr are not used for indirect calls
when harden-sls-blr is enabled.

Differential Revision: https://reviews.llvm.org/D92469

Added: 


Modified: 
llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
llvm/lib/Target/ARM/ARMBaseInstrInfo.h
llvm/lib/Target/ARM/ARMCallLowering.cpp
llvm/lib/Target/ARM/ARMExpandPseudoInsts.cpp
llvm/lib/Target/ARM/ARMFastISel.cpp
llvm/lib/Target/ARM/ARMFeatures.h
llvm/lib/Target/ARM/ARMISelLowering.cpp
llvm/lib/Target/ARM/ARMInstrInfo.td
llvm/lib/Target/ARM/ARMInstrThumb.td
llvm/lib/Target/ARM/ARMPredicates.td
llvm/lib/Target/ARM/ARMRegisterBankInfo.cpp
llvm/lib/Target/ARM/ARMRegisterInfo.td
llvm/lib/Target/ARM/ARMSLSHardening.cpp
llvm/test/CodeGen/ARM/speculation-hardening-sls.ll

Removed: 




diff  --git a/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp 
b/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
index 9a71b9264fcd..e77ed2c34bd3 100644
--- a/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
+++ b/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
@@ -5803,7 +5803,9 @@ outliner::OutlinedFunction 
ARMBaseInstrInfo::getOutliningCandidateInfo(
 NumBytesToCreateFrame = Costs.FrameTailCall;
 SetCandidateCallInfo(MachineOutlinerTailCall, Costs.CallTailCall);
   } else if (LastInstrOpcode == ARM::BL || LastInstrOpcode == ARM::BLX ||
- LastInstrOpcode == ARM::tBL || LastInstrOpcode == ARM::tBLXr ||
+ LastInstrOpcode == ARM::BLX_noip || LastInstrOpcode == ARM::tBL ||
+ LastInstrOpcode == ARM::tBLXr ||
+ LastInstrOpcode == ARM::tBLXr_noip ||
  LastInstrOpcode == ARM::tBLXi) {
 FrameID = MachineOutlinerThunk;
 NumBytesToCreateFrame = Costs.FrameThunk;
@@ -6051,7 +6053,8 @@ 
ARMBaseInstrInfo::getOutliningType(MachineBasicBlock::iterator &MIT,
 // we don't get unexpected results with call pseudo-instructions.
 auto UnknownCallOutlineType = outliner::InstrType::Illegal;
 if (Opc == ARM::BL || Opc == ARM::tBL || Opc == ARM::BLX ||
-Opc == ARM::tBLXr || Opc == ARM::tBLXi)
+Opc == ARM::BLX_noip || Opc == ARM::tBLXr || Opc == ARM::tBLXr_noip ||
+Opc == ARM::tBLXi)
   UnknownCallOutlineType = outliner::InstrType::LegalTerminator;
 
 if (!Callee)
@@ -6343,3 +6346,19 @@ bool 
ARMBaseInstrInfo::isReallyTriviallyReMaterializable(const MachineInstr &MI,
   // spill/restore and VPT predication.
   return isVCTP(&MI) && !isPredicated(MI);
 }
+
+unsigned llvm::getBLXOpcode(const MachineFunction &MF) {
+  return (MF.getSubtarget().hardenSlsBlr()) ? ARM::BLX_noip
+  : ARM::BLX;
+}
+
+unsigned llvm::gettBLXrOpcode(const MachineFunction &MF) {
+  return (MF.getSubtarget().hardenSlsBlr()) ? ARM::tBLXr_noip
+  : ARM::tBLXr;
+}
+
+unsigned llvm::getBLXpredOpcode(const MachineFunction &MF) {
+  return (MF.getSubtarget().hardenSlsBlr()) ? ARM::BLX_pred_noip
+  : ARM::BLX_pred;
+}
+

diff  --git a/llvm/lib/Target/ARM/ARMBaseInstrInfo.h 
b/llvm/lib/Target/ARM/ARMBaseInstrInfo.h
index 47a2cf44f3a9..9b6572848ebe 100644
--- a/llvm/lib/Target/ARM/ARMBaseInstrInfo.h
+++ b/llvm/lib/Target/ARM/ARMBaseInstrInfo.h
@@ -640,13 +640,16 @@ static inline bool isIndirectCall(const MachineInstr &MI) 
{
   switch (Opc) {
 // indirect calls:
   case ARM::BLX:
+  case ARM::BLX_noip:
   case ARM::BLX_pred:
+  case ARM::BLX_pred_noip:
   case ARM::BX_CALL:
   case ARM::BMOVPCRX_CALL:
   case ARM::TCRETURNri:
   case ARM::TAILJMPr:
   case ARM::TAILJMPr4:
   case ARM::tBLXr:
+  case ARM::tBLXr_noip:
   case ARM::tBLXNSr:
   case ARM::tBLXNS_CALL:
   case ARM::tBX_CALL:
@@ -908,6 +911,10 @@ inline bool isGatherScatter(IntrinsicInst *IntInst) {
   return isGather(IntInst) || isScatter(IntInst);
 }
 
+unsigned getBLXOpcode(const MachineFunction &MF);
+unsigned gettBLXrOpcode(const MachineFunction &MF);
+unsigned getBLXpredOpcode(const MachineFunction &MF);
+
 } // end namespace llvm
 
 #endif // LLVM_LIB_TARGET_ARM_ARMBASEINSTRINFO_H

diff  --git a/llvm/lib/Target/ARM/ARMCallLowering.cpp 
b/llvm/lib/Target/ARM/ARMCallLowering.cpp
index 2efc26203f58..0a38f737cb4b 100644
--- a/llvm/lib/Target/ARM/ARMCallLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMCal

[llvm-branch-commits] [clang] 9c895ae - [ARM] Add clang command line support for -mharden-sls=

2020-12-19 Thread Kristof Beyls via llvm-branch-commits

Author: Kristof Beyls
Date: 2020-12-19T12:49:26Z
New Revision: 9c895aea118a2f50ca8413372363c3ff6ecc21bf

URL: 
https://github.com/llvm/llvm-project/commit/9c895aea118a2f50ca8413372363c3ff6ecc21bf
DIFF: 
https://github.com/llvm/llvm-project/commit/9c895aea118a2f50ca8413372363c3ff6ecc21bf.diff

LOG: [ARM] Add clang command line support for -mharden-sls=

The command line syntax is identical to the -mharden-sls= command line
syntax for AArch64 targets.

Differential Revision: https://reviews.llvm.org/D93221

Added: 
clang/test/Driver/sls-hardening-options.c

Modified: 
clang/include/clang/Basic/DiagnosticDriverKinds.td
clang/lib/Driver/ToolChains/Arch/ARM.cpp
clang/lib/Driver/ToolChains/Arch/ARM.h

Removed: 
clang/test/Driver/aarch64-sls-hardening-options.c



diff  --git a/clang/include/clang/Basic/DiagnosticDriverKinds.td 
b/clang/include/clang/Basic/DiagnosticDriverKinds.td
index c67cce099a28..e92a4bf1dac5 100644
--- a/clang/include/clang/Basic/DiagnosticDriverKinds.td
+++ b/clang/include/clang/Basic/DiagnosticDriverKinds.td
@@ -348,6 +348,8 @@ def err_invalid_branch_protection: Error <
   "invalid branch protection option '%0' in '%1'">;
 def err_invalid_sls_hardening : Error<
   "invalid sls hardening option '%0' in '%1'">;
+def err_sls_hardening_arm_not_supported : Error<
+  "-mharden-sls is only supported on armv7-a or later">;
 
 def note_drv_command_failed_diag_msg : Note<
   "diagnostic msg: %0">;

diff  --git a/clang/lib/Driver/ToolChains/Arch/ARM.cpp 
b/clang/lib/Driver/ToolChains/Arch/ARM.cpp
index 309a7298300f..ef590db1eecd 100644
--- a/clang/lib/Driver/ToolChains/Arch/ARM.cpp
+++ b/clang/lib/Driver/ToolChains/Arch/ARM.cpp
@@ -32,6 +32,12 @@ bool arm::isARMMProfile(const llvm::Triple &Triple) {
   return llvm::ARM::parseArchProfile(Arch) == llvm::ARM::ProfileKind::M;
 }
 
+// True if A-profile.
+bool arm::isARMAProfile(const llvm::Triple &Triple) {
+  llvm::StringRef Arch = Triple.getArchName();
+  return llvm::ARM::parseArchProfile(Arch) == llvm::ARM::ProfileKind::A;
+}
+
 // Get Arch/CPU from args.
 void arm::getARMArchCPUFromArgs(const ArgList &Args, llvm::StringRef &Arch,
 llvm::StringRef &CPU, bool FromAs) {
@@ -606,6 +612,45 @@ void arm::getARMTargetFeatures(const Driver &D, const 
llvm::Triple &Triple,
 
   if (Args.hasArg(options::OPT_mno_neg_immediates))
 Features.push_back("+no-neg-immediates");
+
+  // Enable/disable straight line speculation hardening.
+  if (Arg *A = Args.getLastArg(options::OPT_mharden_sls_EQ)) {
+StringRef Scope = A->getValue();
+bool EnableRetBr = false;
+bool EnableBlr = false;
+if (Scope != "none" && Scope != "all") {
+  SmallVector Opts;
+  Scope.split(Opts, ",");
+  for (auto Opt : Opts) {
+Opt = Opt.trim();
+if (Opt == "retbr") {
+  EnableRetBr = true;
+  continue;
+}
+if (Opt == "blr") {
+  EnableBlr = true;
+  continue;
+}
+D.Diag(diag::err_invalid_sls_hardening)
+<< Scope << A->getAsString(Args);
+break;
+  }
+} else if (Scope == "all") {
+  EnableRetBr = true;
+  EnableBlr = true;
+}
+
+if (EnableRetBr || EnableBlr)
+  if (!(isARMAProfile(Triple) && getARMSubArchVersionNumber(Triple) >= 7))
+D.Diag(diag::err_sls_hardening_arm_not_supported)
+<< Scope << A->getAsString(Args);
+
+if (EnableRetBr)
+  Features.push_back("+harden-sls-retbr");
+if (EnableBlr)
+  Features.push_back("+harden-sls-blr");
+  }
+
 }
 
 const std::string arm::getARMArch(StringRef Arch, const llvm::Triple &Triple) {

diff  --git a/clang/lib/Driver/ToolChains/Arch/ARM.h 
b/clang/lib/Driver/ToolChains/Arch/ARM.h
index 091c09b160ae..02d91cdaee13 100644
--- a/clang/lib/Driver/ToolChains/Arch/ARM.h
+++ b/clang/lib/Driver/ToolChains/Arch/ARM.h
@@ -63,6 +63,7 @@ void getARMTargetFeatures(const Driver &D, const llvm::Triple 
&Triple,
   std::vector &Features, bool ForAS);
 int getARMSubArchVersionNumber(const llvm::Triple &Triple);
 bool isARMMProfile(const llvm::Triple &Triple);
+bool isARMAProfile(const llvm::Triple &Triple);
 
 } // end namespace arm
 } // end namespace tools

diff  --git a/clang/test/Driver/aarch64-sls-hardening-options.c 
b/clang/test/Driver/aarch64-sls-hardening-options.c
deleted file mode 100644
index 250007aa1254..
--- a/clang/test/Driver/aarch64-sls-hardening-options.c
+++ /dev/null
@@ -1,45 +0,0 @@
-// Check the -mharden-sls= option, which has a required argument to select
-// scope.
-// RUN: %clang -target aarch64--none-eabi -c %s -### 2>&1 | \
-// RUN: FileCheck %s --check-prefix=RETBR-OFF --check-prefix=BLR-OFF
-
-// RUN: %clang -target aarch64--none-eabi -c %s -### -mharden-sls=none 2>&1 | \
-// RUN: FileCheck %s --check-prefix=RETBR-OFF --check-prefix=BLR-OFF
-
-// RUN: %clang -target aarch64--none-eabi

[llvm-branch-commits] [llvm] d98e4c0 - Add a few more release notes for ARM and AArch64.

2020-08-31 Thread Kristof Beyls via llvm-branch-commits

Author: Kristof Beyls
Date: 2020-08-31T14:10:54+02:00
New Revision: d98e4c0d9a3585e2302c717beea9b9d03df9663a

URL: 
https://github.com/llvm/llvm-project/commit/d98e4c0d9a3585e2302c717beea9b9d03df9663a
DIFF: 
https://github.com/llvm/llvm-project/commit/d98e4c0d9a3585e2302c717beea9b9d03df9663a.diff

LOG: Add a few more release notes for ARM and AArch64.

Added: 


Modified: 
llvm/docs/ReleaseNotes.rst

Removed: 




diff  --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index c7ca861dbc34..8171f9d990c9 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -119,11 +119,26 @@ Changes to the AArch64 Backend
   ``00bet5``). For more information, see the ``clang`` 11 release
   notes.
 
+* Added support for Armv8.6-A:
+
+  Assembly support for the following extensions:
+  - Enhanced Counter Virtualization (ARMv8.6-ECV).
+  - Fine Grained Traps (ARMv8.6-FGT).
+  - Activity Monitors virtualization (ARMv8.6-AMU).
+  - Data gathering hint (ARMv8.0-DGH).
+
+  Assembly and intrinsics support for the Armv8.6-A Matrix Multiply extension
+  for Neon and SVE vectors.
+
+  Support for the ARMv8.2-BF16 BFloat16 extension. This includes a new C-level
+  storage-only `__bf16` type, a `BFloat` IR type, a `bf16` MVT, and assembly
+  and intrinsics support.
+
+* Added support for Cortex-A34, Cortex-A77, Cortex-A78 and Cortex-X1 cores.
+
 Changes to the ARM Backend
 --
 
-During this release ...
-
 * Implemented C-language intrinsics for the full Arm v8.1-M MVE instruction
   set.  now supports the complete API defined in the Arm C
   Language Extensions.
@@ -139,6 +154,19 @@ During this release ...
   default may wish to specify ``-fno-omit-frame-pointer`` to get the old
   behavior. This improves compatibility with GCC.
 
+* Added support for Armv8.6-A:
+
+  Assembly and intrinsics support for the Armv8.6-A Matrix Multiply extension
+  for Neon vectors.
+
+  Support for the ARMv8.2-AA32BF16 BFloat16 extension. This includes a new
+  C-level storage-only `__bf16` type, a `BFloat` IR type, a `bf16` MVT, and
+  assembly and intrinsics support.
+
+* Added support for CMSE.
+
+* Added support for Cortex-M55, Cortex-A77, Cortex-A78 and Cortex-X1 cores.
+
 Changes to the MIPS Target
 --
 



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 424fdbc - collect_and_build_with_pgo.py: adapt to monorepo

2020-12-01 Thread Kristof Beyls via llvm-branch-commits

Author: Kristof Beyls
Date: 2020-12-01T09:16:12+01:00
New Revision: 424fdbc3dedadf882fda107aa5638b39e7036c4d

URL: 
https://github.com/llvm/llvm-project/commit/424fdbc3dedadf882fda107aa5638b39e7036c4d
DIFF: 
https://github.com/llvm/llvm-project/commit/424fdbc3dedadf882fda107aa5638b39e7036c4d.diff

LOG: collect_and_build_with_pgo.py: adapt to monorepo

Differential Revision: https://reviews.llvm.org/D92328

Added: 


Modified: 
llvm/utils/collect_and_build_with_pgo.py

Removed: 




diff  --git a/llvm/utils/collect_and_build_with_pgo.py 
b/llvm/utils/collect_and_build_with_pgo.py
index 5a8686a88b4f..e9f82617f4e9 100755
--- a/llvm/utils/collect_and_build_with_pgo.py
+++ b/llvm/utils/collect_and_build_with_pgo.py
@@ -146,9 +146,9 @@ def output_subdir(self, name):
 
 def has_llvm_subproject(self, name):
 if name == 'compiler-rt':
-subdir = 'projects/compiler-rt'
+subdir = '../compiler-rt'
 elif name == 'clang':
-subdir = 'tools/clang'
+subdir = '../clang'
 else:
 raise ValueError('Unknown subproject: %s' % name)
 
@@ -161,9 +161,8 @@ def run_command(self,
 cwd=None,
 check=False,
 silent_unless_error=False):
-cmd_str = ' '.join(shlex.quote(s) for s in cmd)
 print(
-'Running `%s` in %s' % (cmd_str, shlex.quote(cwd or os.getcwd(
+'Running `%s` in %s' % (cmd, shlex.quote(cwd or os.getcwd(
 
 if self.dry_run:
 return
@@ -372,7 +371,8 @@ def _parse_args():
 else:
 output_dir = os.path.abspath(args.out_dir)
 
-extra_args = {'CMAKE_BUILD_TYPE': 'Release'}
+extra_args = {'CMAKE_BUILD_TYPE': 'Release',
+  'LLVM_ENABLE_PROJECTS': 'clang;compiler-rt;lld'}
 for arg in args.cmake_extra_arg:
 if arg.startswith('-D'):
 arg = arg[2:]



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] compiler-rt: Introduce runtime functions for emulated PAC. (PR #133530)

2025-04-05 Thread Kristof Beyls via llvm-branch-commits


@@ -0,0 +1,7343 @@
+/*
+ * xxHash - Extremely Fast Hash algorithm
+ * Header File
+ * Copyright (C) 2012-2023 Yann Collet
+ *
+ * BSD 2-Clause License (https://www.opensource.org/licenses/bsd-license.php)

kbeyls wrote:

This is a license different from Apache-2.0 WITH LLVM-exception.
Therefore, the process described at 
https://llvm.org/docs/DeveloperPolicy.html#copyright-license-and-patents should 
be followed to check whether this is acceptable in this specific case.

That being said, xxhash is already present under 
[llvm/lib/Support/xxhash.cpp](https://github.com/llvm/llvm-project/blob/21eeca3db0341fef4ab4a6464ffe38b2eba5810c/llvm/lib/Support/xxhash.cpp#L163),
 as you pointed out in the 
[RFC](https://discourse.llvm.org/t/rfc-emulated-pac/85557).

Making sure we don't have multiple copies of non-Apache-2.0 WITH LLVM-exception 
code would be preferable. I'll tag @beanz, as he had ideas about how to better 
structure vendored third party code in LLVM.

I'm not sure if moving non-Apache-2.0 WITH LLVM-exception licensed code to a 
run-time library (for the first time?) triggers new concerns.

I'll note that other hashing algorithms, such as 
[Blake3](https://github.com/llvm/llvm-project/blob/21eeca3db0341fef4ab4a6464ffe38b2eba5810c/llvm/include/llvm/Support/BLAKE3.h#L1)
 and 
[SipHash](https://github.com/llvm/llvm-project/blob/21eeca3db0341fef4ab4a6464ffe38b2eba5810c/llvm/lib/Support/SipHash.cpp#L1),
 which are available under the Apache-2.0 WITH LLVM-exception, are also already 
present in the LLVM Support library.
Would one of these preferably-licensed hashing algorithms be a good fit for the 
hashing functionality needed for this use case?



https://github.com/llvm/llvm-project/pull/133530
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: Detect address materialization and arithmetics (PR #132540)

2025-04-04 Thread Kristof Beyls via llvm-branch-commits


@@ -304,6 +304,36 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 }
   }
 
+  MCPhysReg getSafelyMaterializedAddressReg(const MCInst &Inst) const override 
{
+switch (Inst.getOpcode()) {
+case AArch64::ADR:
+case AArch64::ADRP:

kbeyls wrote:

Maybe there could be a comment here about why `ADR` and `ADRP` produce a 
"safely materialized address register"? For example, "These instructions 
produce an address value in the destination register, based only on information 
in parts of the instruction encoding, i.e. based on information from read-only 
code memory. Therefore, the value in the register it writes is safe according 
to the assumed threat model"

https://github.com/llvm/llvm-project/pull/132540
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect non-protected indirect calls (PR #131899)

2025-03-25 Thread Kristof Beyls via llvm-branch-commits


@@ -577,6 +577,16 @@ class MCPlusBuilder {
 return getNoRegister();
   }
 
+  /// Returns the register used as call destination, or no-register, if not
+  /// an indirect call. Sets IsAuthenticatedInternally if the instruction
+  /// accepts signed pointer as its operand and authenticates it internally.

kbeyls wrote:

maybe s/accepts signed pointer/accepts a signed pointer/?

https://github.com/llvm/llvm-project/pull/131899
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect non-protected indirect calls (PR #131899)

2025-03-25 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/131899
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-07 Thread Kristof Beyls via llvm-branch-commits


@@ -622,6 +628,40 @@ class MCPlusBuilder {
 return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.
+  ///
+  /// It is possible for pointer authentication instructions not to terminate
+  /// the program abnormally on authentication failure and return some *invalid
+  /// pointer* instead (like it is done on AArch64 when FEAT_FPAC is not
+  /// implemented). This might be enough to crash on invalid memory access
+  /// when the pointer is later used as the destination of load/store or branch
+  /// instruction. On the other hand, when the pointer is not used right away,
+  /// it may be important for the compiler to check the address explicitly not
+  /// to introduce signing or authentication oracle.
+  ///
+  /// If this function returns a (Reg, Inst) pair, then it is known that in any
+  /// successor of BB either
+  /// * Reg is trusted, provided it was safe-to-dereference before Inst, or

kbeyls wrote:

I think that "trusted" isn't defined before this sentence?
I'm assuming that "trusted" means "has been authenticated and program 
termination is guaranteed if authentication failed"?
Maybe we should have a specific term for that?
It seems to me that "trusted" might be a too-generic term, and becomes too 
confusing to use for this.
Off the top of my head, I'm thinking maybe "fully-authenticated" might work? 
Not sure if @jacobbramley has a better suggestion?

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: analyze functions without CFG information (PR #133461)

2025-04-10 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/133461
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-07 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-07 Thread Kristof Beyls via llvm-branch-commits


@@ -622,6 +628,40 @@ class MCPlusBuilder {
 return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.
+  ///
+  /// It is possible for pointer authentication instructions not to terminate
+  /// the program abnormally on authentication failure and return some *invalid
+  /// pointer* instead (like it is done on AArch64 when FEAT_FPAC is not
+  /// implemented). This might be enough to crash on invalid memory access
+  /// when the pointer is later used as the destination of load/store or branch
+  /// instruction. On the other hand, when the pointer is not used right away,
+  /// it may be important for the compiler to check the address explicitly not
+  /// to introduce signing or authentication oracle.

kbeyls wrote:

either "a signing..." or "oracles"?

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-07 Thread Kristof Beyls via llvm-branch-commits


@@ -622,6 +628,40 @@ class MCPlusBuilder {
 return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.

kbeyls wrote:

"a pointer being valid" could mean many things in this target-agnostic header 
file.
Maybe the words "pointer authentication" and "correctly authenticated" or 
something similar should be used in this first sentence?
Maybe
`/// Analyzes if this basic block fully authenticates a signed pointer, 
including triggering program termination when invalidly signed` or something 
along those lines?

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-07 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls commented:

I only reviewed a small part of this PR so far, but will probably not have more 
time today or tomorrow to review further, so I thought I'd share my thoughts so 
far.

I think that the documentation at 
https://github.com/llvm/llvm-project/blob/main/bolt/docs/BinaryAnalysis.md 
should probably be updated to describe the added functionality in this, and 
maybe also some earlier, PRs.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-07 Thread Kristof Beyls via llvm-branch-commits


@@ -622,6 +628,40 @@ class MCPlusBuilder {
 return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.
+  ///
+  /// It is possible for pointer authentication instructions not to terminate
+  /// the program abnormally on authentication failure and return some *invalid
+  /// pointer* instead (like it is done on AArch64 when FEAT_FPAC is not
+  /// implemented). This might be enough to crash on invalid memory access
+  /// when the pointer is later used as the destination of load/store or branch
+  /// instruction. On the other hand, when the pointer is not used right away,
+  /// it may be important for the compiler to check the address explicitly not
+  /// to introduce signing or authentication oracle.
+  ///
+  /// If this function returns a (Reg, Inst) pair, then it is known that in any
+  /// successor of BB either
+  /// * Reg is trusted, provided it was safe-to-dereference before Inst, or
+  /// * the program is terminated abnormally without introducing any signing
+  ///   or authentication oracles
+  virtual std::optional>

kbeyls wrote:

Before reading the rest of this patch, I would guess that in theory there could 
be multiple such (Reg, Inst) pairs in a single basic block? If so, should this 
return a list/vector/... instead of an optional?

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-18 Thread Kristof Beyls via llvm-branch-commits


@@ -622,6 +628,40 @@ class MCPlusBuilder {
 return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.
+  ///
+  /// It is possible for pointer authentication instructions not to terminate
+  /// the program abnormally on authentication failure and return some *invalid
+  /// pointer* instead (like it is done on AArch64 when FEAT_FPAC is not
+  /// implemented). This might be enough to crash on invalid memory access
+  /// when the pointer is later used as the destination of load/store or branch
+  /// instruction. On the other hand, when the pointer is not used right away,
+  /// it may be important for the compiler to check the address explicitly not
+  /// to introduce signing or authentication oracle.
+  ///
+  /// If this function returns a (Reg, Inst) pair, then it is known that in any
+  /// successor of BB either
+  /// * Reg is trusted, provided it was safe-to-dereference before Inst, or
+  /// * the program is terminated abnormally without introducing any signing
+  ///   or authentication oracles
+  virtual std::optional>

kbeyls wrote:

Off the top of my head, for the following example code (which probably wouldn't 
be produced by any currently existing compiler, but that could be written by 
hand in assembly?), I guess that's a single basic block for which 
`getAuthCheckedReg` would have to return 2 pairs?:
```
bb1:
autda x0, x1
autda x2, x3
ldr xzr, [x0]
ldr xzr, [x2]
```

I'm OK with not handling this case in this PR, but then maybe there should be a 
test case with a basic block similar to the above to show how BOLT will either 
produce an error message or otherwise fails to handle this case?

In any case, this PR should probably also document in 
https://github.com/llvm/llvm-project/blob/main/bolt/docs/BinaryAnalysis.md what 
the user of this analysis can expect in terms of false positives and true 
negatives.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-18 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls commented:

Thanks for the updates.
I managed to glance through the whole PR now.
I will do a more detailed review once my main currently remaining question is 
answered about which cases are supported and which ones are not supported. I 
think this should be written up as part of this PR in the user-facing 
documentation at bolt/docs/BinaryAnalysis.md .

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: use more appropriate types (NFC) (PR #135661)

2025-04-18 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls approved this pull request.

LGTM, thanks!

https://github.com/llvm/llvm-project/pull/135661
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: use more appropriate types (NFC) (PR #135661)

2025-04-18 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/135661
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Kristof Beyls via llvm-branch-commits


@@ -622,6 +628,40 @@ class MCPlusBuilder {
 return std::make_pair(getNoRegister(), getNoRegister());
   }
 
+  /// Analyzes if a pointer is checked to be valid by the end of BB.
+  ///
+  /// It is possible for pointer authentication instructions not to terminate
+  /// the program abnormally on authentication failure and return some *invalid
+  /// pointer* instead (like it is done on AArch64 when FEAT_FPAC is not
+  /// implemented). This might be enough to crash on invalid memory access
+  /// when the pointer is later used as the destination of load/store or branch
+  /// instruction. On the other hand, when the pointer is not used right away,
+  /// it may be important for the compiler to check the address explicitly not
+  /// to introduce signing or authentication oracle.
+  ///
+  /// If this function returns a (Reg, Inst) pair, then it is known that in any
+  /// successor of BB either
+  /// * Reg is trusted, provided it was safe-to-dereference before Inst, or
+  /// * the program is terminated abnormally without introducing any signing
+  ///   or authentication oracles
+  virtual std::optional>

kbeyls wrote:

Thank you for those improved comments and the test case, that makes a lot more 
sense to me now!

One thing I'm still wondering about before resolving this review thread, is if 
you have any thoughts about whether any instruction patterns that guarantee the 
program will terminate on an authentication fault might not be detected by 
either of the 2 versions of `getAuthCheckedReg`?

It makes sense to update the documentation in a separate PR, but it might be 
useful to record the answer to the above question in a review comment to later 
help write up in the documentation what classes of false positive/negatives the 
user may still expect to see?

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-18 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Kristof Beyls via llvm-branch-commits


@@ -307,8 +340,10 @@ class SrcSafetyAnalysis {
 
   SrcState createEntryState() {
 SrcState S(NumRegs, RegsToTrackInstsFor.getNumTrackedRegisters());
-for (MCPhysReg Reg : BC.MIB->getTrustedLiveInRegs())
-  S.SafeToDerefRegs |= BC.MIB->getAliases(Reg, /*OnlySmaller=*/true);
+for (MCPhysReg Reg : BC.MIB->getTrustedLiveInRegs()) {
+  S.TrustedRegs |= BC.MIB->getAliases(Reg, /*OnlySmaller=*/true);
+  S.SafeToDerefRegs = S.TrustedRegs;
+}

kbeyls wrote:

Nit pick: I guess
`S.SafeToDerefRegs = S.TrustedRegs` could be move to after the loop, rather 
than inside the loop?

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Kristof Beyls via llvm-branch-commits


@@ -339,6 +369,183 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 }
   }
 
+  std::optional>
+  getAuthCheckedReg(BinaryBasicBlock &BB) const override {
+// Match several possible hard-coded sequences of instructions which can be
+// emitted by LLVM backend to check that the authenticated pointer is
+// correct (see AArch64AsmPrinter::emitPtrauthCheckAuthenticatedValue).
+//
+// This function only matches sequences involving branch instructions.
+// All these sequences have the form:
+//
+// (0) ... regular code that authenticates a pointer in Xn ...
+// (1) analyze Xn
+// (2) branch to .Lon_success if the pointer is correct
+// (3) BRK #imm (fall-through basic block)
+//
+// In the above pseudocode, (1) + (2) is one of the following sequences:
+//
+// - eor Xtmp, Xn, Xn, lsl #1
+//   tbz Xtmp, #62, .Lon_success
+//
+// - mov Xtmp, Xn
+//   xpac(i|d) Xn (or xpaclri if Xn is LR)
+//   cmp Xtmp, Xn
+//   b.eq .Lon_success
+//
+// Note that any branch destination operand is accepted as .Lon_success -
+// it is the responsibility of the caller of getAuthCheckedReg to inspect
+// the list of successors of this basic block as appropriate.
+
+// Any of the above code sequences assume the fall-through basic block
+// is a dead-end BRK instruction (any immediate operand is accepted).
+const BinaryBasicBlock *BreakBB = BB.getFallthrough();
+if (!BreakBB || BreakBB->empty() ||
+BreakBB->front().getOpcode() != AArch64::BRK)
+  return std::nullopt;
+
+// Iterate over the instructions of BB in reverse order, matching opcodes
+// and operands.
+MCPhysReg TestedReg = 0;
+MCPhysReg ScratchReg = 0;
+auto It = BB.end();
+auto StepAndGetOpcode = [&It, &BB]() -> int {
+  if (It == BB.begin())
+return -1;
+  --It;
+  return It->getOpcode();
+};
+
+switch (StepAndGetOpcode()) {
+default:
+  // Not matched the branch instruction.
+  return std::nullopt;
+case AArch64::Bcc:
+  // Bcc EQ, .Lon_success
+  if (It->getOperand(0).getImm() != AArch64CC::EQ)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // SUBSXrs XZR, TestedReg, ScratchReg, 0 (used by "CMP reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::SUBSXrs ||
+  It->getOperand(0).getReg() != AArch64::XZR ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  ScratchReg = It->getOperand(2).getReg();
+
+  // Either XPAC(I|D) ScratchReg, ScratchReg
+  // or XPACLRI
+  switch (StepAndGetOpcode()) {
+  default:
+return std::nullopt;
+  case AArch64::XPACLRI:
+// No operands to check, but using XPACLRI forces TestedReg to be X30.
+if (TestedReg != AArch64::LR)
+  return std::nullopt;
+break;
+  case AArch64::XPACI:
+  case AArch64::XPACD:
+if (It->getOperand(0).getReg() != ScratchReg ||
+It->getOperand(1).getReg() != ScratchReg)
+  return std::nullopt;
+break;
+  }
+
+  // ORRXrs ScratchReg, XZR, TestedReg, 0 (used by "MOV reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::ORRXrs)
+return std::nullopt;
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(1).getReg() != AArch64::XZR ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+
+case AArch64::TBZX:
+  // TBZX ScratchReg, 62, .Lon_success
+  ScratchReg = It->getOperand(0).getReg();
+  if (It->getOperand(1).getImm() != 62)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // EORXrs ScratchReg, TestedReg, TestedReg, 1
+  if (StepAndGetOpcode() != AArch64::EORXrs)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 1)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+}
+  }
+
+  MCPhysReg getAuthCheckedReg(const MCInst &Inst,
+  bool MayOverwrite) const override {
+// Cannot trivially reuse AArch64InstrInfo::getMemOperandWithOffsetWidth()
+// method as it accepts an instance of MachineInstr, not MCInst.
+const MCInstrDesc &Desc = Info->get(Inst.getOpcode());
+
+// If signing oracles are considered, the particular value left in the base
+// register after this instruction is important. This function checks that
+// if the base register was overwritten, it is due to address write-back.
+//
+// Note that this function is not needed for authentication oracles, as the
+  

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls commented:

Just adding a few more comments as I'm reading through the patch.
I haven't yet managed to read the full patch in detail yet

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Kristof Beyls via llvm-branch-commits


@@ -339,6 +369,183 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 }
   }
 
+  std::optional>
+  getAuthCheckedReg(BinaryBasicBlock &BB) const override {
+// Match several possible hard-coded sequences of instructions which can be
+// emitted by LLVM backend to check that the authenticated pointer is
+// correct (see AArch64AsmPrinter::emitPtrauthCheckAuthenticatedValue).
+//
+// This function only matches sequences involving branch instructions.
+// All these sequences have the form:
+//
+// (0) ... regular code that authenticates a pointer in Xn ...
+// (1) analyze Xn
+// (2) branch to .Lon_success if the pointer is correct
+// (3) BRK #imm (fall-through basic block)
+//
+// In the above pseudocode, (1) + (2) is one of the following sequences:
+//
+// - eor Xtmp, Xn, Xn, lsl #1
+//   tbz Xtmp, #62, .Lon_success
+//
+// - mov Xtmp, Xn
+//   xpac(i|d) Xn (or xpaclri if Xn is LR)
+//   cmp Xtmp, Xn
+//   b.eq .Lon_success
+//
+// Note that any branch destination operand is accepted as .Lon_success -
+// it is the responsibility of the caller of getAuthCheckedReg to inspect
+// the list of successors of this basic block as appropriate.
+
+// Any of the above code sequences assume the fall-through basic block
+// is a dead-end BRK instruction (any immediate operand is accepted).
+const BinaryBasicBlock *BreakBB = BB.getFallthrough();
+if (!BreakBB || BreakBB->empty() ||
+BreakBB->front().getOpcode() != AArch64::BRK)
+  return std::nullopt;
+
+// Iterate over the instructions of BB in reverse order, matching opcodes
+// and operands.
+MCPhysReg TestedReg = 0;
+MCPhysReg ScratchReg = 0;
+auto It = BB.end();
+auto StepAndGetOpcode = [&It, &BB]() -> int {
+  if (It == BB.begin())
+return -1;
+  --It;
+  return It->getOpcode();
+};
+
+switch (StepAndGetOpcode()) {
+default:
+  // Not matched the branch instruction.
+  return std::nullopt;
+case AArch64::Bcc:
+  // Bcc EQ, .Lon_success
+  if (It->getOperand(0).getImm() != AArch64CC::EQ)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // SUBSXrs XZR, TestedReg, ScratchReg, 0 (used by "CMP reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::SUBSXrs ||
+  It->getOperand(0).getReg() != AArch64::XZR ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  ScratchReg = It->getOperand(2).getReg();
+
+  // Either XPAC(I|D) ScratchReg, ScratchReg
+  // or XPACLRI
+  switch (StepAndGetOpcode()) {
+  default:
+return std::nullopt;
+  case AArch64::XPACLRI:
+// No operands to check, but using XPACLRI forces TestedReg to be X30.
+if (TestedReg != AArch64::LR)
+  return std::nullopt;
+break;
+  case AArch64::XPACI:
+  case AArch64::XPACD:
+if (It->getOperand(0).getReg() != ScratchReg ||
+It->getOperand(1).getReg() != ScratchReg)
+  return std::nullopt;
+break;
+  }
+
+  // ORRXrs ScratchReg, XZR, TestedReg, 0 (used by "MOV reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::ORRXrs)
+return std::nullopt;
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(1).getReg() != AArch64::XZR ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+
+case AArch64::TBZX:
+  // TBZX ScratchReg, 62, .Lon_success
+  ScratchReg = It->getOperand(0).getReg();
+  if (It->getOperand(1).getImm() != 62)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // EORXrs ScratchReg, TestedReg, TestedReg, 1
+  if (StepAndGetOpcode() != AArch64::EORXrs)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 1)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+}
+  }
+
+  MCPhysReg getAuthCheckedReg(const MCInst &Inst,
+  bool MayOverwrite) const override {
+// Cannot trivially reuse AArch64InstrInfo::getMemOperandWithOffsetWidth()
+// method as it accepts an instance of MachineInstr, not MCInst.
+const MCInstrDesc &Desc = Info->get(Inst.getOpcode());
+
+// If signing oracles are considered, the particular value left in the base
+// register after this instruction is important. This function checks that
+// if the base register was overwritten, it is due to address write-back.
+//
+// Note that this function is not needed for authentication oracles, as the
+  

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-22 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-28 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls approved this pull request.

Thanks for your patience with my many questions!
This looks good to merge to me now.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-28 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-28 Thread Kristof Beyls via llvm-branch-commits


@@ -462,7 +563,22 @@ class DataflowSrcSafetyAnalysis
 return DFParent::getStateBefore(Inst);
   }
 
-  void run() override { DFParent::run(); }
+  void run() override {
+for (BinaryBasicBlock &BB : Func) {
+  if (auto CheckerInfo = BC.MIB->getAuthCheckedReg(BB)) {
+MCInst *LastInstOfChecker = BB.getLastNonPseudoInstr();
+LLVM_DEBUG({
+  dbgs() << "Found pointer checking sequence in " << BB.getName()
+ << ":\n";
+  traceReg(BC, "Checked register", CheckerInfo->first);
+  traceInst(BC, "First instruction", *CheckerInfo->second);
+  traceInst(BC, "Last instruction", *LastInstOfChecker);
+});
+CheckerSequenceInfo[LastInstOfChecker] = *CheckerInfo;
+  }
+}

kbeyls wrote:

Fair enough, let's leave it as it is.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-28 Thread Kristof Beyls via llvm-branch-commits


@@ -591,7 +591,9 @@ obscure_indirect_call_arg_nocfg:
 .globl  safe_lr_at_function_entry_nocfg
 .type   safe_lr_at_function_entry_nocfg,@function
 safe_lr_at_function_entry_nocfg:
-// CHECK-NOT: safe_lr_at_function_entry_nocfg
+// Due to state being reset after a label, paciasp is reported as
+// a signing oracle - this is a known false positive, ignore it.
+// CHECK-NOT: non-protected call{{.*}}safe_lr_at_function_entry_nocfg
 cbz x0, 1f
 ret// LR is safe at the start of the 
function
 1:

kbeyls wrote:

Thanks, that's useful info to know!
FWIW, my experience on pac-ret is that most code generated by the compiler 
follows mostly the same very regular structure, so I'm not surprised that 
you're not getting many false positives on llvm-test-suite.
In my experience with pac-ret scanning, you get most false positives when 
scanning across a full distribution, on pieces of code that were not generated 
by a popular compiler, such as hand-written assembly...

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-28 Thread Kristof Beyls via llvm-branch-commits


@@ -355,6 +389,46 @@ class SrcSafetyAnalysis {
 return Regs;
   }
 
+  // Returns all registers made trusted by this instruction.
+  SmallVector getRegsMadeTrusted(const MCInst &Point,
+const SrcState &Cur) const {
+SmallVector Regs;
+const MCPhysReg NoReg = BC.MIB->getNoRegister();
+
+// An authenticated pointer can be checked, or
+MCPhysReg CheckedReg =
+BC.MIB->getAuthCheckedReg(Point, /*MayOverwrite=*/false);
+if (CheckedReg != NoReg && Cur.SafeToDerefRegs[CheckedReg])
+  Regs.push_back(CheckedReg);
+
+if (CheckerSequenceInfo.contains(&Point)) {
+  MCPhysReg CheckedReg;
+  const MCInst *FirstCheckerInst;
+  std::tie(CheckedReg, FirstCheckerInst) = CheckerSequenceInfo.at(&Point);
+
+  // FirstCheckerInst should belong to the same basic block, meaning
+  // it was deterministically processed a few steps before this 
instruction.
+  const SrcState &StateBeforeChecker =
+  getStateBefore(*FirstCheckerInst).get();

kbeyls wrote:

Thanks, that all makes sense.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-23 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-23 Thread Kristof Beyls via llvm-branch-commits


@@ -339,6 +369,183 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 }
   }
 
+  std::optional>
+  getAuthCheckedReg(BinaryBasicBlock &BB) const override {
+// Match several possible hard-coded sequences of instructions which can be
+// emitted by LLVM backend to check that the authenticated pointer is
+// correct (see AArch64AsmPrinter::emitPtrauthCheckAuthenticatedValue).
+//
+// This function only matches sequences involving branch instructions.
+// All these sequences have the form:
+//
+// (0) ... regular code that authenticates a pointer in Xn ...
+// (1) analyze Xn
+// (2) branch to .Lon_success if the pointer is correct
+// (3) BRK #imm (fall-through basic block)
+//
+// In the above pseudocode, (1) + (2) is one of the following sequences:
+//
+// - eor Xtmp, Xn, Xn, lsl #1
+//   tbz Xtmp, #62, .Lon_success
+//
+// - mov Xtmp, Xn
+//   xpac(i|d) Xn (or xpaclri if Xn is LR)
+//   cmp Xtmp, Xn
+//   b.eq .Lon_success
+//
+// Note that any branch destination operand is accepted as .Lon_success -
+// it is the responsibility of the caller of getAuthCheckedReg to inspect
+// the list of successors of this basic block as appropriate.
+
+// Any of the above code sequences assume the fall-through basic block
+// is a dead-end BRK instruction (any immediate operand is accepted).
+const BinaryBasicBlock *BreakBB = BB.getFallthrough();
+if (!BreakBB || BreakBB->empty() ||
+BreakBB->front().getOpcode() != AArch64::BRK)
+  return std::nullopt;
+
+// Iterate over the instructions of BB in reverse order, matching opcodes
+// and operands.
+MCPhysReg TestedReg = 0;
+MCPhysReg ScratchReg = 0;
+auto It = BB.end();
+auto StepAndGetOpcode = [&It, &BB]() -> int {
+  if (It == BB.begin())
+return -1;
+  --It;
+  return It->getOpcode();
+};
+
+switch (StepAndGetOpcode()) {
+default:
+  // Not matched the branch instruction.
+  return std::nullopt;
+case AArch64::Bcc:
+  // Bcc EQ, .Lon_success
+  if (It->getOperand(0).getImm() != AArch64CC::EQ)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // SUBSXrs XZR, TestedReg, ScratchReg, 0 (used by "CMP reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::SUBSXrs ||
+  It->getOperand(0).getReg() != AArch64::XZR ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  ScratchReg = It->getOperand(2).getReg();
+
+  // Either XPAC(I|D) ScratchReg, ScratchReg
+  // or XPACLRI
+  switch (StepAndGetOpcode()) {
+  default:
+return std::nullopt;
+  case AArch64::XPACLRI:
+// No operands to check, but using XPACLRI forces TestedReg to be X30.
+if (TestedReg != AArch64::LR)
+  return std::nullopt;
+break;
+  case AArch64::XPACI:
+  case AArch64::XPACD:
+if (It->getOperand(0).getReg() != ScratchReg ||
+It->getOperand(1).getReg() != ScratchReg)
+  return std::nullopt;
+break;
+  }
+
+  // ORRXrs ScratchReg, XZR, TestedReg, 0 (used by "MOV reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::ORRXrs)
+return std::nullopt;
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(1).getReg() != AArch64::XZR ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+
+case AArch64::TBZX:
+  // TBZX ScratchReg, 62, .Lon_success
+  ScratchReg = It->getOperand(0).getReg();
+  if (It->getOperand(1).getImm() != 62)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // EORXrs ScratchReg, TestedReg, TestedReg, 1
+  if (StepAndGetOpcode() != AArch64::EORXrs)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 1)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+}
+  }
+
+  MCPhysReg getAuthCheckedReg(const MCInst &Inst,
+  bool MayOverwrite) const override {
+// Cannot trivially reuse AArch64InstrInfo::getMemOperandWithOffsetWidth()
+// method as it accepts an instance of MachineInstr, not MCInst.
+const MCInstrDesc &Desc = Info->get(Inst.getOpcode());
+
+// If signing oracles are considered, the particular value left in the base
+// register after this instruction is important. This function checks that
+// if the base register was overwritten, it is due to address write-back.
+//
+// Note that this function is not needed for authentication oracles, as the
+  

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-23 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls commented:

Apologies for only reviewing piece-meal. I'm struggling a bit at the time to 
reserve longer blocks of time to review this fully in one go.
I hope my comments still make sense though

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: analyze functions without CFG information (PR #133461)

2025-04-10 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls approved this pull request.

LGTM, thanks!

https://github.com/llvm/llvm-project/pull/133461
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-23 Thread Kristof Beyls via llvm-branch-commits


@@ -339,6 +369,198 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 }
   }
 
+  std::optional>
+  getAuthCheckedReg(BinaryBasicBlock &BB) const override {
+// Match several possible hard-coded sequences of instructions which can be
+// emitted by LLVM backend to check that the authenticated pointer is
+// correct (see AArch64AsmPrinter::emitPtrauthCheckAuthenticatedValue).
+//
+// This function only matches sequences involving branch instructions.
+// All these sequences have the form:
+//
+// (0) ... regular code that authenticates a pointer in Xn ...
+// (1) analyze Xn
+// (2) branch to .Lon_success if the pointer is correct
+// (3) BRK #imm (fall-through basic block)
+//
+// In the above pseudocode, (1) + (2) is one of the following sequences:
+//
+// - eor Xtmp, Xn, Xn, lsl #1
+//   tbz Xtmp, #62, .Lon_success
+//
+// - mov Xtmp, Xn
+//   xpac(i|d) Xn (or xpaclri if Xn is LR)
+//   cmp Xtmp, Xn
+//   b.eq .Lon_success
+//
+// Note that any branch destination operand is accepted as .Lon_success -
+// it is the responsibility of the caller of getAuthCheckedReg to inspect
+// the list of successors of this basic block as appropriate.
+
+// Any of the above code sequences assume the fall-through basic block
+// is a dead-end BRK instruction (any immediate operand is accepted).
+const BinaryBasicBlock *BreakBB = BB.getFallthrough();
+if (!BreakBB || BreakBB->empty() ||
+BreakBB->front().getOpcode() != AArch64::BRK)
+  return std::nullopt;
+
+// Iterate over the instructions of BB in reverse order, matching opcodes
+// and operands.
+MCPhysReg TestedReg = 0;
+MCPhysReg ScratchReg = 0;
+auto It = BB.end();
+auto StepAndGetOpcode = [&It, &BB]() -> int {
+  if (It == BB.begin())
+return -1;
+  --It;
+  return It->getOpcode();
+};
+
+switch (StepAndGetOpcode()) {
+default:
+  // Not matched the branch instruction.
+  return std::nullopt;
+case AArch64::Bcc:
+  // Bcc EQ, .Lon_success
+  if (It->getOperand(0).getImm() != AArch64CC::EQ)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // SUBSXrs XZR, TestedReg, ScratchReg, 0 (used by "CMP reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::SUBSXrs ||
+  It->getOperand(0).getReg() != AArch64::XZR ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  ScratchReg = It->getOperand(2).getReg();
+
+  // Either XPAC(I|D) ScratchReg, ScratchReg
+  // or XPACLRI
+  switch (StepAndGetOpcode()) {
+  default:
+return std::nullopt;
+  case AArch64::XPACLRI:
+// No operands to check, but using XPACLRI forces TestedReg to be X30.
+if (TestedReg != AArch64::LR)
+  return std::nullopt;
+break;
+  case AArch64::XPACI:
+  case AArch64::XPACD:
+if (It->getOperand(0).getReg() != ScratchReg ||
+It->getOperand(1).getReg() != ScratchReg)
+  return std::nullopt;
+break;
+  }
+
+  // ORRXrs ScratchReg, XZR, TestedReg, 0 (used by "MOV reg, reg" alias)
+  if (StepAndGetOpcode() != AArch64::ORRXrs)
+return std::nullopt;
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(1).getReg() != AArch64::XZR ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 0)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+
+case AArch64::TBZX:
+  // TBZX ScratchReg, 62, .Lon_success
+  ScratchReg = It->getOperand(0).getReg();
+  if (It->getOperand(1).getImm() != 62)
+return std::nullopt;
+  // Not checking .Lon_success (see above).
+
+  // EORXrs ScratchReg, TestedReg, TestedReg, 1
+  if (StepAndGetOpcode() != AArch64::EORXrs)
+return std::nullopt;
+  TestedReg = It->getOperand(1).getReg();
+  if (It->getOperand(0).getReg() != ScratchReg ||
+  It->getOperand(2).getReg() != TestedReg ||
+  It->getOperand(3).getImm() != 1)
+return std::nullopt;
+
+  return std::make_pair(TestedReg, &*It);
+}
+  }
+
+  MCPhysReg getAuthCheckedReg(const MCInst &Inst,
+  bool MayOverwrite) const override {
+// Cannot trivially reuse AArch64InstrInfo::getMemOperandWithOffsetWidth()
+// method as it accepts an instance of MachineInstr, not MCInst.
+const MCInstrDesc &Desc = Info->get(Inst.getOpcode());
+
+// If signing oracles are considered, the particular value left in the base
+// register after this instruction is important. This function checks that
+// if the base register was overwritten, it is due to address write-back:
+//
+// ; good:
+// autdza  x1   ; x1 is authenticated (may fail

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-25 Thread Kristof Beyls via llvm-branch-commits


@@ -462,7 +563,22 @@ class DataflowSrcSafetyAnalysis
 return DFParent::getStateBefore(Inst);
   }
 
-  void run() override { DFParent::run(); }
+  void run() override {
+for (BinaryBasicBlock &BB : Func) {
+  if (auto CheckerInfo = BC.MIB->getAuthCheckedReg(BB)) {
+MCInst *LastInstOfChecker = BB.getLastNonPseudoInstr();
+LLVM_DEBUG({
+  dbgs() << "Found pointer checking sequence in " << BB.getName()
+ << ":\n";
+  traceReg(BC, "Checked register", CheckerInfo->first);
+  traceInst(BC, "First instruction", *CheckerInfo->second);
+  traceInst(BC, "Last instruction", *LastInstOfChecker);
+});
+CheckerSequenceInfo[LastInstOfChecker] = *CheckerInfo;
+  }
+}

kbeyls wrote:

Another nit-pick.
This to me looks like it's initializing the `CheckerSequenceInfo` variable of 
the `SrcSafetyAnalysis` parent class.
Wouldn't it be cleaner to do this initializing in the constructor for 
`SrcSafetyAnalysis`, rather than in this `run` method?

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-25 Thread Kristof Beyls via llvm-branch-commits


@@ -591,7 +591,9 @@ obscure_indirect_call_arg_nocfg:
 .globl  safe_lr_at_function_entry_nocfg
 .type   safe_lr_at_function_entry_nocfg,@function
 safe_lr_at_function_entry_nocfg:
-// CHECK-NOT: safe_lr_at_function_entry_nocfg
+// Due to state being reset after a label, paciasp is reported as
+// a signing oracle - this is a known false positive, ignore it.
+// CHECK-NOT: non-protected call{{.*}}safe_lr_at_function_entry_nocfg
 cbz x0, 1f
 ret// LR is safe at the start of the 
function
 1:

kbeyls wrote:

[Re: lines 
+594 to +600]
I'm wondering if this false positive pattern could end up appearing quite a few 
times in real code, specifically in code that has been shrink-wrap optimized?
Did you run this scanner on a larger code base? How many and what kind of false 
positives did you see?
See this comment 
inline on https://app.graphite.dev/github/pr/llvm/llvm-project/134146?utm_source=unchanged-line-comment";>Graphite.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-25 Thread Kristof Beyls via llvm-branch-commits


@@ -355,6 +389,46 @@ class SrcSafetyAnalysis {
 return Regs;
   }
 
+  // Returns all registers made trusted by this instruction.
+  SmallVector getRegsMadeTrusted(const MCInst &Point,
+const SrcState &Cur) const {
+SmallVector Regs;
+const MCPhysReg NoReg = BC.MIB->getNoRegister();
+
+// An authenticated pointer can be checked, or
+MCPhysReg CheckedReg =
+BC.MIB->getAuthCheckedReg(Point, /*MayOverwrite=*/false);
+if (CheckedReg != NoReg && Cur.SafeToDerefRegs[CheckedReg])
+  Regs.push_back(CheckedReg);
+
+if (CheckerSequenceInfo.contains(&Point)) {
+  MCPhysReg CheckedReg;
+  const MCInst *FirstCheckerInst;
+  std::tie(CheckedReg, FirstCheckerInst) = CheckerSequenceInfo.at(&Point);
+
+  // FirstCheckerInst should belong to the same basic block, meaning
+  // it was deterministically processed a few steps before this 
instruction.
+  const SrcState &StateBeforeChecker =
+  getStateBefore(*FirstCheckerInst).get();

kbeyls wrote:

This is a nitpick.
I was trying to get my head around whether it's guaranteed to get to a fixed 
point in a dataflow analysis, where next to using just the previous state on 
the current instruction, also a state on another instruction is used as input 
to compute the next state for the current instruction.
I'm assuming this is probably fine (probably, because as you write in the 
comment, this instruction should be in the same basic block).
I thought I'd just ask if you ended up checking this somewhere else and, if so, 
you'd happen to have a pointer to something stating that you can also take 
state from another instruction in the same basic block, and still be guaranteed 
to reach a fixed point in a dataflow analysis?

Orthogonal to that: would it be hard to add an assert here that 
`*FirstCheckerInst` is indeed in the same basic block?

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-25 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls commented:

I finally managed to read through this patch end-to-end.
I only have 3 very nit-picky questions left.
This looks almost ready to merge.

https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect signing oracles (PR #134146)

2025-04-25 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/134146
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: use more appropriate types (NFC) (PR #135661)

2025-04-25 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls approved this pull request.


https://github.com/llvm/llvm-project/pull/135661
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: use more appropriate types (NFC) (PR #135661)

2025-04-25 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/135661
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: refactor issue reporting (PR #135662)

2025-04-29 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/135662
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: refactor issue reporting (PR #135662)

2025-04-29 Thread Kristof Beyls via llvm-branch-commits


@@ -198,73 +198,147 @@ raw_ostream &operator<<(raw_ostream &OS, const 
MCInstReference &);
 
 namespace PAuthGadgetScanner {
 
+// The report classes are designed to be used in an immutable manner.
+// When an issue report is constructed in multiple steps, an attempt is made
+// to distinguish intermediate and final results at the type level.
+//
+// Here is an overview of issue life-cycle:
+// * an analysis (SrcSafetyAnalysis at now, DstSafetyAnalysis will be added
+//   later to support the detection of authentication oracles) computes 
register
+//   state for each instruction in the function.
+// * each instruction is checked to be a gadget of some kind, taking the
+//   computed state into account. If a gadget is found, its kind and location
+//   are stored into a subclass of Diagnostic wrapped into BriefReport.

kbeyls wrote:

Reading it like this in the documentation, it seems to me that a name like 
`PartialReport` would slightly better describe what it is than `BriefReport`. 
WDYT?

https://github.com/llvm/llvm-project/pull/135662
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: refactor issue reporting (PR #135662)

2025-04-29 Thread Kristof Beyls via llvm-branch-commits


@@ -198,73 +198,147 @@ raw_ostream &operator<<(raw_ostream &OS, const 
MCInstReference &);
 
 namespace PAuthGadgetScanner {
 
+// The report classes are designed to be used in an immutable manner.
+// When an issue report is constructed in multiple steps, an attempt is made
+// to distinguish intermediate and final results at the type level.
+//
+// Here is an overview of issue life-cycle:
+// * an analysis (SrcSafetyAnalysis at now, DstSafetyAnalysis will be added
+//   later to support the detection of authentication oracles) computes 
register
+//   state for each instruction in the function.
+// * each instruction is checked to be a gadget of some kind, taking the

kbeyls wrote:

nitpick: maybe "For each instruction, it is checked whether it is a gadget of 
some kind"?

https://github.com/llvm/llvm-project/pull/135662
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: refactor issue reporting (PR #135662)

2025-04-29 Thread Kristof Beyls via llvm-branch-commits


@@ -198,73 +198,147 @@ raw_ostream &operator<<(raw_ostream &OS, const 
MCInstReference &);
 
 namespace PAuthGadgetScanner {
 
+// The report classes are designed to be used in an immutable manner.
+// When an issue report is constructed in multiple steps, an attempt is made
+// to distinguish intermediate and final results at the type level.
+//
+// Here is an overview of issue life-cycle:
+// * an analysis (SrcSafetyAnalysis at now, DstSafetyAnalysis will be added
+//   later to support the detection of authentication oracles) computes 
register
+//   state for each instruction in the function.
+// * each instruction is checked to be a gadget of some kind, taking the
+//   computed state into account. If a gadget is found, its kind and location
+//   are stored into a subclass of Diagnostic wrapped into BriefReport.
+// * if any issue is to be reported for the function, the same analysis is
+//   re-run to collect extra information to provide to the user. Which extra
+//   information can be requested depends on the particular analysis (for
+//   example, SrcSafetyAnalysis is able to compute the set of instructions
+//   clobbering the particular register, thus ReqT is MCPhysReg). At this 
stage,
+//   `FinalReport`s are created.
+//
+// Here, the subclasses of Diagnostic store the pieces of information which
+// are kept unchanged since they are collected on the first run of the 
analysis.
+// BriefReport::RequestedDetails, on the other hand, is replaced with
+// FinalReport::Details computed by the second run of the analysis.
+
 /// Description of a gadget kind that can be detected. Intended to be
-/// statically allocated to be attached to reports by reference.
+/// statically allocated and attached to reports by reference.
 class GadgetKind {
   const char *Description;
 
 public:
+  /// Wraps a description string which must be a string literal.
   GadgetKind(const char *Description) : Description(Description) {}
 
   StringRef getDescription() const { return Description; }
 };
 
-/// Base report located at some instruction, without any additional 
information.
-struct Report {
+/// Basic diagnostic information, which is kept unchanged since it is collected
+/// on the first run of the analysis.
+struct Diagnostic {
   MCInstReference Location;
 
-  Report(MCInstReference Location) : Location(Location) {}
-  virtual ~Report() {}
+  Diagnostic(MCInstReference Location) : Location(Location) {}
+  virtual ~Diagnostic() {}
 
   virtual void generateReport(raw_ostream &OS,
   const BinaryContext &BC) const = 0;
 
-  // The two methods below are called by Analysis::computeDetailedInfo when
-  // iterating over the reports.
-  virtual ArrayRef getAffectedRegisters() const { return {}; }
-  virtual void setOverwritingInstrs(ArrayRef Instrs) {}
-
   void printBasicInfo(raw_ostream &OS, const BinaryContext &BC,
   StringRef IssueKind) const;
 };
 
-struct GadgetReport : public Report {
+struct GadgetDiagnostic : public Diagnostic {
   // The particular kind of gadget that is detected.
   const GadgetKind &Kind;
-  // The set of registers related to this gadget report (possibly empty).
-  SmallVector AffectedRegisters;
-  // The instructions that clobber the affected registers.
-  // There is no one-to-one correspondence with AffectedRegisters: for example,
-  // the same register can be overwritten by different instructions in 
different
-  // preceding basic blocks.
-  SmallVector OverwritingInstrs;
-
-  GadgetReport(const GadgetKind &Kind, MCInstReference Location,
-   MCPhysReg AffectedRegister)
-  : Report(Location), Kind(Kind), AffectedRegisters({AffectedRegister}) {}
 
-  void generateReport(raw_ostream &OS, const BinaryContext &BC) const override;
+  GadgetDiagnostic(const GadgetKind &Kind, MCInstReference Location)
+  : Diagnostic(Location), Kind(Kind) {}
 
-  ArrayRef getAffectedRegisters() const override {
-return AffectedRegisters;
-  }
-
-  void setOverwritingInstrs(ArrayRef Instrs) override {
-OverwritingInstrs.assign(Instrs.begin(), Instrs.end());
-  }
+  void generateReport(raw_ostream &OS, const BinaryContext &BC) const override;
 };
 
 /// Report with a free-form message attached.
-struct GenericReport : public Report {
+struct GenericDiagnostic : public Diagnostic {
   std::string Text;
-  GenericReport(MCInstReference Location, StringRef Text)
-  : Report(Location), Text(Text) {}
+  GenericDiagnostic(MCInstReference Location, StringRef Text)
+  : Diagnostic(Location), Text(Text) {}
   virtual void generateReport(raw_ostream &OS,
   const BinaryContext &BC) const override;
 };
 
+/// An information about an issue collected on the slower, detailed,

kbeyls wrote:

"An information" sounds a bit strange to me. Maybe better to say "Extra 
information" here?

https://github.com/llvm/llvm-project/pull/135662
___

[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: refactor issue reporting (PR #135662)

2025-04-29 Thread Kristof Beyls via llvm-branch-commits


@@ -720,16 +713,30 @@ SrcSafetyAnalysis::create(BinaryFunction &BF,
RegsToTrackInstsFor);
 }
 
-static std::shared_ptr
+static BriefReport make_generic_report(MCInstReference Location,
+  StringRef Text) {
+  auto Report = std::make_shared(Location, Text);
+  return BriefReport(Report, std::nullopt);
+}
+
+template 
+static BriefReport make_report(const GadgetKind &Kind,
+  MCInstReference Location,

kbeyls wrote:

If the first function is called `make_generic_report`, would it be more 
consistent to call the second function `make_gadget_report`?

I started wondering if it'd be possible to unify the two functions into a more 
general templated function, something like
```
template 
static BriefReport make_brief_report(MCInstReference Location, DiagParam P, 
ReqDetailsT ReqDetails) {
  auto Report = std::make_shared(Location, P);
  return BriefReport(Report, ReqDetails);
}
```

and then it gets called like
`make_brief_report(Location, Text, 
std::nullopt)` or
`make_brief_report(Location, 
Kind, RequestedDetails)`
(sorry, haven't spent time thinking about reducing the number of explicit 
template parameters).

I'm guessing you might have thought about this and come to the conclusion that 
having two `make_report` functions is better?



https://github.com/llvm/llvm-project/pull/135662
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: refactor issue reporting (PR #135662)

2025-04-29 Thread Kristof Beyls via llvm-branch-commits


@@ -198,73 +198,147 @@ raw_ostream &operator<<(raw_ostream &OS, const 
MCInstReference &);
 
 namespace PAuthGadgetScanner {
 
+// The report classes are designed to be used in an immutable manner.
+// When an issue report is constructed in multiple steps, an attempt is made
+// to distinguish intermediate and final results at the type level.
+//
+// Here is an overview of issue life-cycle:
+// * an analysis (SrcSafetyAnalysis at now, DstSafetyAnalysis will be added
+//   later to support the detection of authentication oracles) computes 
register
+//   state for each instruction in the function.
+// * each instruction is checked to be a gadget of some kind, taking the
+//   computed state into account. If a gadget is found, its kind and location
+//   are stored into a subclass of Diagnostic wrapped into BriefReport.
+// * if any issue is to be reported for the function, the same analysis is
+//   re-run to collect extra information to provide to the user. Which extra
+//   information can be requested depends on the particular analysis (for

kbeyls wrote:

nitpick: maybe it's simpler and still correct to just phrase this as "Which 
extra information depends on the particular analysis"?

https://github.com/llvm/llvm-project/pull/135662
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: refactor issue reporting (PR #135662)

2025-04-29 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls commented:

I just have a few more nit-picky comments/questions, but overall looks good to 
me.

https://github.com/llvm/llvm-project/pull/135662
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: Detect address materialization and arithmetics (PR #132540)

2025-03-31 Thread Kristof Beyls via llvm-branch-commits


@@ -587,6 +587,22 @@ class MCPlusBuilder {
 return getNoRegister();
   }
 
+  virtual MCPhysReg getSafelyMaterializedAddressReg(const MCInst &Inst) const {
+llvm_unreachable("not implemented");
+return getNoRegister();
+  }

kbeyls wrote:

I think that "safely materialized address register" is not a common term and 
different people seeing it are likely to have somewhat different guesses at 
what it might mean.
Therefore, this method probably needs a comment that explains what a "safely 
materialized address register" is.

https://github.com/llvm/llvm-project/pull/132540
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: Detect address materialization and arithmetics (PR #132540)

2025-03-31 Thread Kristof Beyls via llvm-branch-commits


@@ -335,6 +335,50 @@ class PacRetAnalysis
 });
   }
 
+  BitVector getClobberedRegs(const MCInst &Point) const {
+BitVector Clobbered(NumRegs, false);
+// Assume a call can clobber all registers, including callee-saved
+// registers. There's a good chance that callee-saved registers will be
+// saved on the stack at some point during execution of the callee.
+// Therefore they should also be considered as potentially modified by an
+// attacker/written to.
+// Also, not all functions may respect the AAPCS ABI rules about
+// caller/callee-saved registers.
+if (BC.MIB->isCall(Point))
+  Clobbered.set();
+else
+  BC.MIB->getClobberedRegs(Point, Clobbered);
+return Clobbered;
+  }
+
+  // Returns all registers that can be treated as if they are written by an
+  // authentication instruction.
+  SmallVector getAuthenticatedRegs(const MCInst &Point,

kbeyls wrote:

Since this function is being changed to no longer return strictly only the 
registers that are authenticated by the instruction in `Point`, I think it 
would be best to adjust the name of this function accordingly.

I'm not sure I can easily come up with a better name. Would any of the 
following names be better?
- `getSafeToDerefRegs`
- `getNonAttackerControlledRegs`
or maybe
- `getSafeRegsWrittenBy` (since this method only returns registers written by 
`Point`?)

https://github.com/llvm/llvm-project/pull/132540
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: Detect address materialization and arithmetics (PR #132540)

2025-03-31 Thread Kristof Beyls via llvm-branch-commits


@@ -587,6 +587,22 @@ class MCPlusBuilder {
 return getNoRegister();
   }
 
+  virtual MCPhysReg getSafelyMaterializedAddressReg(const MCInst &Inst) const {
+llvm_unreachable("not implemented");
+return getNoRegister();
+  }
+
+  /// Analyzes if this instruction can safely perform address arithmetics.

kbeyls wrote:

Similar here, I think a definition is needed for what "safely performing 
address arithmetics" means here.

I'm assuming that "safe" here is in the context of a particular threat model.
Different threat models may require different definitions of "safe". In other 
words, what is "safe" under one threat model, might not be safe under another 
threat model.

I think that the threat model should be described as accurately as possible in 
at least the comment documenting these methods.

Is my guess correct that roughly the threat model is "The assumption is that 
values stored in data memory are 'unsafe', because the attackers under our 
threat model (question: can we point to a written up threat model somewhere?) 
are assumed to be able to change values in writeable data memory. In contrast, 
values in code memory or in registers are assumed to not be changeable by an 
attacker".

https://github.com/llvm/llvm-project/pull/132540
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: Detect address materialization and arithmetics (PR #132540)

2025-03-31 Thread Kristof Beyls via llvm-branch-commits


@@ -1,7 +1,8 @@
 if "AArch64" not in config.root.targets:
 config.unsupported = True
 
-flags = "--target=aarch64-linux-gnu -nostartfiles -nostdlib -ffreestanding"
+# -Wl,--no-relax prevents converting ADRP+ADD pairs into NOP+ADR.
+flags = "--target=aarch64-linux-gnu -nostartfiles -nostdlib -ffreestanding 
-Wl,--no-relax"

kbeyls wrote:

I'm not entirely sure, but maybe we should only add `-Wl,--no-relax` to test 
case gs-pauth-address-materialization.s (and other test cases that require it 
in the future)?
If we add too many special flags to the default flags for all the binary 
analysis test cases, we might end up running all test cases accidentally under 
some flags that are too far removed from the typical flags used to build 
production software?

https://github.com/llvm/llvm-project/pull/132540
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: Detect address materialization and arithmetics (PR #132540)

2025-03-31 Thread Kristof Beyls via llvm-branch-commits


@@ -355,37 +399,39 @@ class PacRetAnalysis
   return State();
 }
 
+// First, compute various properties of the instruction, taking the state
+// before its execution into account, if necessary.
+
+BitVector Clobbered = getClobberedRegs(Point);
+// Compute the set of registers that can be considered as written by
+// an authentication instruction. This includes operations that are
+// *strictly better* than authentication, such as materializing a
+// PC-relative constant.

kbeyls wrote:

See my comment above, I hope we can come up with a better name than 
`AuthenticatedOrBetter`.

https://github.com/llvm/llvm-project/pull/132540
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] compiler-rt: Introduce runtime functions for emulated PAC. (PR #133530)

2025-03-28 Thread Kristof Beyls via llvm-branch-commits


@@ -0,0 +1,115 @@
+#include 

kbeyls wrote:

This new file needs a license header, I guess?

https://github.com/llvm/llvm-project/pull/133530
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Make DataflowAnalysis::getStateBefore() const (NFC) (PR #133308)

2025-03-28 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls approved this pull request.


https://github.com/llvm/llvm-project/pull/133308
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Make DataflowAnalysis::getStateBefore() const (NFC) (PR #133308)

2025-03-28 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/133308
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect non-protected indirect calls (PR #131899)

2025-03-25 Thread Kristof Beyls via llvm-branch-commits


@@ -0,0 +1,676 @@
+// RUN: %clang %cflags -march=armv8.3-a %s -o %t.exe
+// RUN: llvm-bolt-binary-analysis --scanners=pacret %t.exe 2>&1 | FileCheck %s

kbeyls wrote:

I'm wondering if the user interface for this should be adapted?

`pac-ret` is a widely deployed hardening scheme, so just verifying the correct 
application of pac-ret hardening is something that users presumably want to do, 
without also checking pauthabi hardening guarantees that do not exist in 
pac-ret?

Maybe the checking of non-protected indirect calls should happen under a 
different `--scanners` option?

https://github.com/llvm/llvm-project/pull/131899
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect non-protected indirect calls (PR #131899)

2025-03-25 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls commented:

I haven't had time yet to review the test cases, but I thought I'd share my 
comments so far already.

https://github.com/llvm/llvm-project/pull/131899
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect non-protected indirect calls (PR #131899)

2025-03-25 Thread Kristof Beyls via llvm-branch-commits


@@ -0,0 +1,676 @@
+// RUN: %clang %cflags -march=armv8.3-a %s -o %t.exe
+// RUN: llvm-bolt-binary-analysis --scanners=pacret %t.exe 2>&1 | FileCheck %s
+
+// FIXME In the below test cases, LR is usually not spilled as needed, as it is
+//   not checked by BOLT.

kbeyls wrote:

I'm wondering if this needs to be a FIXME, or simply a note to explain that the 
test cases are simplified/artificial?
I don't think I've got a strong opinion either way. Presumably, it'd be better 
for the test cases to be closer to "real-world" code, so ideally, they should 
have the ldp/stp spill/fill instructions?

https://github.com/llvm/llvm-project/pull/131899
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect non-protected indirect calls (PR #131899)

2025-03-25 Thread Kristof Beyls via llvm-branch-commits


@@ -277,6 +277,48 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 }
   }
 
+  MCPhysReg
+  getRegUsedAsCallDest(const MCInst &Inst,
+   bool &IsAuthenticatedInternally) const override {

kbeyls wrote:

I'm wondering if this could be adapt so that it only needs to handle indirect 
calls?
That would make the switch statement simpler, and also easier to maintain, 
because it won't need to handle all branch instructions.

For example, at the moment, it seems the switch statement is not handling the 
newly introduced (in armv9.5) Compare and Branch instructions, see 
[https://developer.arm.com/documentation/ddi0602/2024-09/Base-Instructions/CB-cc---register---Compare-registers-and-branch-](https://developer.arm.com/documentation/ddi0602/2024-12/Base-Instructions/CB-cc---register---Compare-registers-and-branch-)

My understanding is that only indirect calls need to be checked, not direct 
calls.

https://github.com/llvm/llvm-project/pull/131899
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect non-protected indirect calls (PR #131899)

2025-03-25 Thread Kristof Beyls via llvm-branch-commits


@@ -382,11 +382,11 @@ class PacRetAnalysis
 
 public:
   std::vector
-  getLastClobberingInsts(const MCInst Ret, BinaryFunction &BF,
- const ArrayRef UsedDirtyRegs) const {

kbeyls wrote:

I was wondering whether `const` has to be removed here.
Looking at the implementation of `getStateAt` and `getStateBefore`, it seems 
that it might be better to add the `const` qualifier to `getStateBefore` at 
https://github.com/llvm/llvm-project/blob/9768077de65e31daa619eae231f027e052d601c2/bolt/include/bolt/Passes/DataflowAnalysis.h#L295?

https://github.com/llvm/llvm-project/pull/131899
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: Detect address materialization and arithmetics (PR #132540)

2025-04-03 Thread Kristof Beyls via llvm-branch-commits


@@ -587,6 +587,22 @@ class MCPlusBuilder {
 return getNoRegister();
   }
 
+  virtual MCPhysReg getSafelyMaterializedAddressReg(const MCInst &Inst) const {
+llvm_unreachable("not implemented");
+return getNoRegister();
+  }

kbeyls wrote:

Thanks, it's probably still a bit complicated, but with the comment it's 
possible for a reader to dig in and understand what the assumed threat model is.

On the comment
```
  /// Returns the register containing an address which is safely materialized
  /// under Pointer Authentication threat model, or NoRegister otherwise.
  ///
  /// The produced address should not be attacker-controlled, assuming an
  /// attacker is able to modify any writable memory, but not executable code
  /// (as it should be W^X).
```
I think that the first sentence could be improved a little bit by also 
explicitly stating that it is the register materialized by `Inst` (the first 
argument to the function). Maybe to something like the following?
```
Returns the register `Inst` writes to if:
1. the register is a materialized address, and
2. the register has been materialized safely, i.e. cannot be 
attacker-controlled, under the Pointer Authentication threat model.
If the instruction does not write to any register satisfying the above 2 
conditions, NoRegister is returned.

The Pointer Authentication threat model assumes code is not writeable (W^X), 
but data memory may be written to by an attacker ("is attracker-controlled")
```

https://github.com/llvm/llvm-project/pull/132540
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: do not crash on debug-printing CFI instructions (PR #136151)

2025-05-28 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls approved this pull request.


https://github.com/llvm/llvm-project/pull/136151
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: do not crash on debug-printing CFI instructions (PR #136151)

2025-05-28 Thread Kristof Beyls via llvm-branch-commits

https://github.com/kbeyls edited 
https://github.com/llvm/llvm-project/pull/136151
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: detect authentication oracles (PR #135663)

2025-05-22 Thread Kristof Beyls via llvm-branch-commits


@@ -717,6 +716,466 @@ SrcSafetyAnalysis::create(BinaryFunction &BF,
RegsToTrackInstsFor);
 }
 
+/// A state representing which registers are safe to be used as the destination
+/// operand of an authentication instruction.
+///
+/// Similar to SrcState, it is the analysis that should take register aliasing
+/// into account.
+///

kbeyls wrote:

At first sight, it looks like there is quite a bit of code added here for 
`DstState` and `DstSafetyAnalysis`. I wonder if you have thought about/analyzed 
if it would be (a) possible and (b) desirable to try and share more of the 
implementation between the `Dst...` and `Src...` classes?
I haven't investigated this at all --- just thought I'd ask you first in case 
you had thought about this before.

https://github.com/llvm/llvm-project/pull/135663
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: clarify MCPlusBuilder callbacks interface (PR #136147)

2025-05-22 Thread Kristof Beyls via llvm-branch-commits


@@ -985,6 +985,26 @@ inst_pacnbibsppc:
 ret
 .size inst_pacnbibsppc, .-inst_pacnbibsppc
 
+// Test that write-back forms of LDRA(A|B) instructions are handled properly.
+
+.globl  inst_ldraa_wb
+.type   inst_ldraa_wb,@function
+inst_ldraa_wb:
+// CHECK-NOT: inst_ldraa_wb
+ldraa   x2, [x0]!
+pacda   x0, x1
+ret
+.size inst_ldraa_wb, .-inst_ldraa_wb
+
+.globl  inst_ldrab_wb
+.type   inst_ldrab_wb,@function
+inst_ldrab_wb:
+// CHECK-NOT: inst_ldrab_wb
+ldraa   x2, [x0]!
+pacda   x0, x1
+ret
+.size inst_ldrab_wb, .-inst_ldrab_wb
+

kbeyls wrote:

Quoting from the ArmARM description for LDRAA, LDRAB:
> The authenticated address is not written back to the base register, unless 
> the pre-indexed variant of the instruction is used.
In this case, the address that is written back to the base register does not 
include the pointer authentication code.

Given whether the base register is authenticated after executing `ldra{ab}` 
depends on whether the pre-index addressing mode is used, I'm wondering if it 
would be a good idea to also add a test that shows that the scanner understands 
that after executing `ldraa x2, [x0]` `x0` contains a signed address, not an 
authenticated address?
Could this make a difference to any of the analysis implemented so far?

https://github.com/llvm/llvm-project/pull/136147
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [BOLT] Gadget scanner: clarify MCPlusBuilder callbacks interface (PR #136147)

2025-05-22 Thread Kristof Beyls via llvm-branch-commits


@@ -562,35 +562,55 @@ class MCPlusBuilder {
 return {};
   }
 
-  virtual ErrorOr getAuthenticatedReg(const MCInst &Inst) const {
-llvm_unreachable("not implemented");
-return getNoRegister();
-  }
-
-  virtual bool isAuthenticationOfReg(const MCInst &Inst,
- MCPhysReg AuthenticatedReg) const {
+  /// Returns the register where an authenticated pointer is written to by 
Inst,
+  /// or std::nullopt if not authenticating any register.
+  ///
+  /// Sets IsChecked if the instruction always checks authenticated pointer,
+  /// i.e. it either writes a successfully authenticated pointer or terminates
+  /// the program abnormally (such as "ldra x0, [x1]!" on AArch64, which 
crashes
+  /// on authentication failure even if FEAT_FPAC is not implemented).
+  virtual std::optional
+  getWrittenAuthenticatedReg(const MCInst &Inst, bool &IsChecked) const {
 llvm_unreachable("not implemented");
-return false;
+return std::nullopt;
   }
 
-  virtual MCPhysReg getSignedReg(const MCInst &Inst) const {
+  /// Returns the register signed by Inst, or std::nullopt if not signing any
+  /// register.
+  ///
+  /// The returned register is assumed to be both input and output operand,
+  /// as it is done on AArch64.
+  virtual std::optional getSignedReg(const MCInst &Inst) const {
 llvm_unreachable("not implemented");
-return getNoRegister();
+return std::nullopt;
   }
 
-  virtual ErrorOr getRegUsedAsRetDest(const MCInst &Inst) const {
+  /// Returns the register used as a return address. Returns std::nullopt if
+  /// not applicable, such as reading the return address from a system register
+  /// or from the stack.
+  ///
+  /// Sets IsAuthenticatedInternally if the instruction accepts a signed
+  /// pointer as its operand and authenticates it internally.
+  ///
+  /// Should only be called when isReturn(Inst) is true.
+  virtual std::optional
+  getRegUsedAsRetDest(const MCInst &Inst,
+  bool &IsAuthenticatedInternally) const {
 llvm_unreachable("not implemented");
-return getNoRegister();
+return std::nullopt;
   }
 
   /// Returns the register used as the destination of an indirect branch or 
call
   /// instruction. Sets IsAuthenticatedInternally if the instruction accepts
   /// a signed pointer as its operand and authenticates it internally.
+  ///
+  /// Should only be called if isIndirectCall(Inst) or isIndirectBranch(Inst)
+  /// returns true.
   virtual MCPhysReg
   getRegUsedAsIndirectBranchDest(const MCInst &Inst,
  bool &IsAuthenticatedInternally) const {
 llvm_unreachable("not implemented");
-return getNoRegister();
+return 0;

kbeyls wrote:

I'm trying to understand why returning `0` is better than `getNoRegister()` 
here?
I don't have a strong opinion, just that I'm not immediately seeing why `0` 
would be better?

https://github.com/llvm/llvm-project/pull/136147
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


  1   2   >