[llvm-branch-commits] [clang] [clang] Implement pointer authentication for C++ virtual functions, v-tables, and VTTs. (PR #93907)
ojhunt wrote: looking as well https://github.com/llvm/llvm-project/pull/93907 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang] Implement pointer authentication for C++ virtual functions, v-tables, and VTTs. (PR #93907)
ojhunt wrote: Build fix is trivial `diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp index 82c4a3c86645..e9a867ff67ba 100644 --- a/clang/lib/Sema/SemaDeclAttr.cpp +++ b/clang/lib/Sema/SemaDeclAttr.cpp @@ -9246,7 +9246,7 @@ static void handleVTablePointerAuthentication(Sema &S, Decl *D, AL.setInvalid(); } -if (!AL.isArgExpr(3) || !S.checkUInt32Argument(AL, AL.getArgAsExpr(3), +if (!AL.isArgExpr(3) || !checkUInt32Argument(S, AL, AL.getArgAsExpr(3), customDiscriminationValue)) { S.Diag(AL.getLoc(), diag::err_invalid_custom_discrimination); AL.setInvalid();` Just the original PR being merged is dates to earlier checkUInt32Argument API https://github.com/llvm/llvm-project/pull/93907 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang] Update use of checkUInt32Argument to match on main (PR #93984)
https://github.com/ojhunt created https://github.com/llvm/llvm-project/pull/93984 Minor correction to match current API >From 0262fdfddf50853d2f40ea86c37877168ad070a8 Mon Sep 17 00:00:00 2001 From: Oliver Hunt <4691426+ojh...@users.noreply.github.com> Date: Fri, 31 May 2024 09:36:32 -0700 Subject: [PATCH] [clang] Update use of checkUInt32Argument to match on main Minor correction to match current API --- clang/lib/Sema/SemaDeclAttr.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp index 82c4a3c866458..e9a867ff67ba1 100644 --- a/clang/lib/Sema/SemaDeclAttr.cpp +++ b/clang/lib/Sema/SemaDeclAttr.cpp @@ -9246,7 +9246,7 @@ static void handleVTablePointerAuthentication(Sema &S, Decl *D, AL.setInvalid(); } -if (!AL.isArgExpr(3) || !S.checkUInt32Argument(AL, AL.getArgAsExpr(3), +if (!AL.isArgExpr(3) || !checkUInt32Argument(S, AL, AL.getArgAsExpr(3), customDiscriminationValue)) { S.Diag(AL.getLoc(), diag::err_invalid_custom_discrimination); AL.setInvalid(); ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang] Implement pointer authentication for C++ virtual functions, v-tables, and VTTs (PR #94054)
https://github.com/ojhunt converted_to_draft https://github.com/llvm/llvm-project/pull/94054 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang] Implement pointer authentication for C++ virtual functions, v-tables, and VTTs (PR #94054)
https://github.com/ojhunt closed https://github.com/llvm/llvm-project/pull/94054 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang] Implement pointer authentication for C++ virtual functions, v-tables, and VTTs (PR #94056)
https://github.com/ojhunt ready_for_review https://github.com/llvm/llvm-project/pull/94056 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang] Implement pointer authentication for C++ virtual functions, v-tables, and VTTs (PR #94056)
@@ -1261,6 +1262,10 @@ class ASTContext : public RefCountedBase { /// space. QualType removeAddrSpaceQualType(QualType T) const; + /// Return the "other" type-specific discriminator for the given type. ojhunt wrote: @asl thoughts? https://github.com/llvm/llvm-project/pull/94056 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang] Implement pointer authentication for C++ virtual functions, v-tables, and VTTs (PR #94056)
@@ -1261,6 +1262,10 @@ class ASTContext : public RefCountedBase { /// space. QualType removeAddrSpaceQualType(QualType T) const; + /// Return the "other" type-specific discriminator for the given type. ojhunt wrote: How would `/// Return the "other" discriminator used for the pointer auth schema used for vtable pointers in instances of the requested type` sound? it's more explicit about what is being produced https://github.com/llvm/llvm-project/pull/94056 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang] Implement pointer authentication for C++ virtual functions, v-tables, and VTTs (PR #94056)
ojhunt wrote: Had to do a force push to resolve merge conflicts following @ahatanak's PR, so this change now includes `CodeGenFunction::EmitPointerAuthAuth` etc https://github.com/llvm/llvm-project/pull/94056 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang] Implement pointer authentication for C++ virtual functions, v-tables, and VTTs (PR #94056)
@@ -7037,8 +7036,64 @@ void ItaniumMangleContextImpl::mangleCXXDtorComdat(const CXXDestructorDecl *D, Mangler.mangle(GlobalDecl(D, Dtor_Comdat)); } +static void mangleOverrideDiscrimination(CXXNameMangler &mangler, + ASTContext &context, + const ThunkInfo &thunk) { + auto &langOpts = context.getLangOpts(); + auto thisType = thunk.ThisType; + auto thisRecord = thisType->getPointeeCXXRecordDecl(); + auto ptrauthClassRecord = context.baseForVTableAuthentication(thisRecord); + unsigned typedDiscriminator = + context.getPointerAuthVTablePointerDiscriminator(thisRecord); + mangler.mangleVendorQualifier("__vtptrauth"); + auto &manglerStream = mangler.getStream(); + manglerStream << "I"; + if (auto explicitAuth = + ptrauthClassRecord->getAttr()) { +manglerStream << "Lj" << explicitAuth->getKey(); ojhunt wrote: I'll see what I can find https://github.com/llvm/llvm-project/pull/94056 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang] Implement pointer authentication for C++ virtual functions, v-tables, and VTTs (PR #94056)
@@ -296,3 +296,21 @@ ConstantAggregateBuilderBase::finishStruct(llvm::StructType *ty) { buffer.erase(buffer.begin() + Begin, buffer.end()); return constant; } + ojhunt wrote: @asl Updating to resolve conflict required bringing this function in that was previously in Akira's work https://github.com/llvm/llvm-project/pull/94056 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang] Remove FEM_Indeterminable (#137661) (PR #138337)
https://github.com/ojhunt created https://github.com/llvm/llvm-project/pull/138337 Remove FEM_Indeterminable as it is unused and cannot be stored safely in an unsigned bitfield >From f104804f217b31f6c41559fc3b388a9c8f0f7571 Mon Sep 17 00:00:00 2001 From: Oliver Hunt Date: Fri, 2 May 2025 13:05:52 -0700 Subject: [PATCH] [clang] Remove FEM_Indeterminable (#137661) Remove FEM_Indeterminable as it is unused and cannot be stored safely in an unsigned bitfield --- clang/include/clang/Basic/LangOptions.h | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/clang/include/clang/Basic/LangOptions.h b/clang/include/clang/Basic/LangOptions.h index 16c35bcf49339..d16f32bf618ee 100644 --- a/clang/include/clang/Basic/LangOptions.h +++ b/clang/include/clang/Basic/LangOptions.h @@ -299,10 +299,7 @@ class LangOptionsBase { }; /// Possible float expression evaluation method choices. - enum FPEvalMethodKind { -/// The evaluation method cannot be determined or is inconsistent for this -/// target. -FEM_Indeterminable = -1, + enum FPEvalMethodKind : unsigned { /// Use the declared type for fp arithmetic. FEM_Source = 0, /// Use the type double for fp arithmetic. ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang] Remove FEM_Indeterminable (#137661) (PR #138337)
ojhunt wrote: @tstellar hi, I'm not sure the process for merging to a release branch, this is a minor correctness fix but it prevents a massive amount of warning spam when building with more current clangs. cc'ing @AaronBallman and @zahiraam so they see this and can weigh in on whether it is reasonable to pick this up. https://github.com/llvm/llvm-project/pull/138337 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang] Remove FEM_Indeterminable (#137661) (PR #138337)
ojhunt wrote: @zahiraam @AaronBallman I'm not sure if this warrants the cherry pick though - because we cannot actually trigger the use of FEM_Indeterminable the fact that the behavior would be wrong isn't relevant, the problem is just the warning spam you get while building with trunk clang - which could be mitigated by people disabling the warning in their cmake config? https://github.com/llvm/llvm-project/pull/138337 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [clang] [clang] Remove FEM_Indeterminable (#137661) (PR #138337)
ojhunt wrote: > > I agree, it's literally just a warning correction, and people building this > > release of llvm with a newer clang can configure it to disable this warning > > - I just wasn't sure what the general policy around this kind of thing was > > which is why I made a PR and asked for advice :D > > @AaronBallman that said I'm not sure this change to the underlying type > > would impact anything - the unsigned and int are the same size, and the > > value is stored in an explicitly `unsigned` bitfield which is the source of > > the warning. > > FWIW, I was thinking less about the size and more about the range of valid > values for the enumeration. ah - I thought you were thinking in terms of the ms padding behavior :D https://github.com/llvm/llvm-project/pull/138337 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
ojhunt wrote: @pcc and I have been discussing this. * The perf issues I was concerned about were predicated on access to a pointer loaded from a field continuing to be checked after the original field load, this is not the case (and in hindsight doing so would imply passing the pointer as a parameter to a function would maintain the tag and require the target knowing about it). * I have a better understanding of why the offset behavior is required - it is easier for me to reason about when considering reference parameters as those are much more ubiquitous and invisible to the user, so requiring annotations or breaking up such calls isn't really an option. I still don't like the current approach but I have yet to think of anything that would be better :-/ * For the POD ABI fixups I realized what I really want is just to have the codegen separated out - if it turns out there are cases where the codegen size is excessive I'm sure we could add a heuristic for outlining it if needed, but I think the real issue for me now was actually just that the fixup codegen was inline. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -7538,6 +7538,14 @@ static bool IsEligibleForTrivialRelocation(Sema &SemaRef, if (!SemaRef.IsCXXTriviallyRelocatableType(Field->getType())) return false; } + + // FIXME: PFP should not affect trivial relocatability, instead it should + // affect the implementation of std::trivially_relocate. See: + // https://discourse.llvm.org/t/rfc-structure-protection-a-family-of-uaf-mitigation-techniques/8/16?u=pcc + if (!SemaRef.Context.arePFPFieldsTriviallyRelocatable(D) && ojhunt wrote: If PFP fields are present, and we haven't yet implemented support for them in `trivially_relocate`, `IsCXXTriviallyRelocatableType` should be returning false. This is something we have to address for address discriminated pointer auth as well. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -1319,14 +1319,66 @@ static llvm::Value *CoerceIntOrPtrToIntOrPtr(llvm::Value *Val, llvm::Type *Ty, /// This safely handles the case when the src type is smaller than the /// destination type; in this situation the values of bits which not /// present in the src are undefined. -static llvm::Value *CreateCoercedLoad(Address Src, llvm::Type *Ty, +static llvm::Value *CreateCoercedLoad(Address Src, QualType SrcFETy, + llvm::Type *Ty, CodeGenFunction &CGF) { llvm::Type *SrcTy = Src.getElementType(); // If SrcTy and Ty are the same, just do a load. if (SrcTy == Ty) return CGF.Builder.CreateLoad(Src); + // Coercion directly through memory does not work if the structure has pointer + // field protection because the struct in registers has a different bit + // pattern to the struct in memory, so we must read the elements one by one + // and use them to form the coerced structure. + std::vector PFPFields; + CGF.getContext().findPFPFields(SrcFETy, CharUnits::Zero(), PFPFields, true); ojhunt wrote: This should be done by generating a copy thunk and calling that, rather than performing this logic repeatedly, and more importantly regenerating it repeatedly. If the update function is small enough it will be inlined, and if it's not you probably do not want a large number of copies of the code. Your intent appears to be to maintain the register transfer abi for POD types, despite the objects logically no longer being POD types - which isn't unreasonable given that the current design means that all structs containing pointers would become non-POD, which is probably suboptimal - but if you are doing so, because a large enough struct can spill, I'd recommend you adopt a model where the PFP fields in a return or parameter copy of the object are tagged specifically as transfer fields, something akin to `hash(argument position || PFP offset || some_this_is_a_transfer_constant)`. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
https://github.com/ojhunt requested changes to this pull request. Thoughts: This should be opt-in on a field or struct granularity, not just a global behavior. In the RFC I think you mentioned not applying PFP to C types, but I'm unsure how you're deciding what is a C type? There are a lot of special cases being added in places that should not need to be aware of PFP - the core type queries should be returning correct values for types containing PFP fields. A lot of the "this read/write/copy/init special work" exactly aligns with the constraints of pointer auth, and I think the overall implementation could be made better by introducing the concept of a `HardenedPointerQualifier` or similar, that could be either PointerAuthQualifier or PFPQualifier. In principle this could even just be treated as a different PointerAuth key set. That approach would also help with some of the problems you have with offsetof and pointers to PFP fields - the thing that causes the problems with the pointer to a PFP field is that the PFP schema for a given field is in reality part of the type, so given ```cpp struct A { void *field1; void *field2; }; ``` It looks you are trying to maintain the idea that `A::field1` and `A::field2` have the same type, which makes this code valid according to the type system: ```cpp void f(void** obj); void g(A *a) { f(&a->field1); f(&a->field2); } ``` but the real types are not the same. The semantic equivalent under pointer auth is something akin to ```cpp struct A { void * __ptrauth(1, 1, hash("A::field1")) field1; void * __ptrauth(1, 1, hash("A::field2")) field2; }; ``` Which makes it very clear that the types are different, and I think trying to pretend they are not is part of what is causing problems. The more I think about it, the more I feel that this could be implemented almost entirely on top of the existing pointer auth work. Currently for pointer auth there's a set of ARM keys specified, and I think you just need to create a different set, PFP_Keys or similar, and set up the appropriate schemas (basically null schemas for all the existing cases), and add a new schema: DefaultFieldSchema or similar, and apply that schema to every pointer field in an object unless there's an explicit qualifier applied already. Then it would in principle just be a matter of making adding the appropriate logic to the ptrauth intrinsics in llvm. Now there are some moderately large caveats to that idea * the existing ptrauth semantics simply say that any struct containing address discriminated values is no-longer a pod type and so gets copied through the stack, which is something you're trying to avoid, so you'd still need to keep that logic (though as I said, that should be done by generating and then calling the functions to do rather than regenerating the entire copy repeatedly). * the way that pointer auth is currently implemented assumes that there is only a single pointer auth abi active so you wouldn't be able to trivially have both pointer auth and PFP live at the same time. That's what makes me think having a SecuredPointer abstraction might be a batter approach, but it might also not be too difficult to update the PointerAuthQualifier to include the ABI being used -- I'm really not sure. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -2976,7 +3006,15 @@ void CodeGenFunction::EmitForwardingCallToLambda( QualType resultType = FPT->getReturnType(); ReturnValueSlot returnSlot; if (!resultType->isVoidType() && - calleeFnInfo->getReturnInfo().getKind() == ABIArgInfo::Indirect && + (calleeFnInfo->getReturnInfo().getKind() == ABIArgInfo::Indirect || + // With pointer field protection, we need to set up the return slot when ojhunt wrote: This should also not need to be modified. Something is up with how you're setting up the record decls if you need to have so many insertions of PFP specific behavior. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -362,6 +362,17 @@ class LangOptionsBase { BKey }; + enum class PointerFieldProtectionKind { ojhunt wrote: I'm not sure I like this being solely a global decision - it makes custom allocators much harder, and it makes it hard for allocators that have different policies, e.g ```cpp struct ImportantStruct { void *operator new(size_t) { /*tagged allocator*/ } }; struct BoringStruct { ... }; struct SomeOtherStruct { ImportantStruct *important; // should be tagged pointer BoringString *lessImportant; // should be a non tagged pointer }; ``` This would again require a struct attribute to indicate the pointer policy https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -928,6 +936,11 @@ namespace { if (PointerAuthQualifier Q = F->getType().getPointerAuth(); Q && Q.isAddressDiscriminated()) return false; + // Non-trivially-copyable fields with pointer field protection need to be ojhunt wrote: This is an example of the code I was thinking of when I talked about sharing code with PointerAuth, the PFP and pointer auth code is essentially identical here. Essentially any place that interacts with a pointer auth qualifier should also have PFP code, and the code is going to be (at the place of interaction, not the underlying implementation) going to be essentially the same. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -3011,6 +3011,12 @@ defm experimental_omit_vtable_rtti : BoolFOption<"experimental-omit-vtable-rtti" NegFlag, BothFlags<[], [CC1Option], " the RTTI component from virtual tables">>; +def experimental_pointer_field_protection_EQ : Joined<["-"], "fexperimental-pointer-field-protection=">, Group, ojhunt wrote: As above, not super sure that I like this being a global compile time setting, but maybe it makes sense to be. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -2268,13 +2293,22 @@ CodeGenFunction::EmitNullInitialization(Address DestPtr, QualType Ty) { // Get and call the appropriate llvm.memcpy overload. Builder.CreateMemCpy(DestPtr, SrcPtr, SizeVal, false); -return; + } else { +// Otherwise, just memset the whole thing to zero. This is legal ojhunt wrote: This means types that require constructors will be doing zero initialization, then reinitializing fields. From a codegen PoV this can lead to codegen along the lines of ``` zero the memory; // compiler is set to initialize all memory zero the memory; // this branch means objects that are not zero initializable get initialized again initialize the memory; // the struct is not zero initializable the constructor/initializer will run ``` Ideally the compiler will optimized this down, but it's both extra codegen, and extra optimization work. Given that your model assumes that null is a safe value for PFP fields, the `isZeroInitializable()` call should return true. For non-zero initializable objects, the initialization code for the object is responsible for initializing the PFP fields. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -2513,6 +2513,12 @@ def CountedByOrNull : DeclOrTypeAttr { let LangOpts = [COnly]; } +def NoPointerFieldProtection : DeclOrTypeAttr { ojhunt wrote: This an ABI break so I don't think it can reasonably an on by default for all structs - we can already see annotations in libc++, and they would be needed on every single struct field. We can imagine a new platform where this would be the platform ABI, but for every other case it is functionally unusable. I would recommend attributes to permit opt in and opt out on a per-struct basis, and CLI flags to select the default behavior. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
https://github.com/ojhunt edited https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
https://github.com/ojhunt edited https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
https://github.com/ojhunt commented: unfortunately I really don't know enough about actual backend codegen to comment on the actual codegen implementation here, but I think there's some design issues with having the backend determine the discriminators rather than having those selected in the front end https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -441,6 +445,254 @@ bool PreISelIntrinsicLowering::expandMemIntrinsicUses(Function &F) const { return Changed; } +namespace { + +enum class PointerEncoding { + Rotate, + PACCopyable, + PACNonCopyable, +}; + +bool expandProtectedFieldPtr(Function &Intr) { + Module &M = *Intr.getParent(); + bool IsAArch64 = Triple(M.getTargetTriple()).isAArch64(); + + std::set NonPFPFields; + std::set LoadsStores; + + Type *Int8Ty = Type::getInt8Ty(M.getContext()); + Type *Int64Ty = Type::getInt64Ty(M.getContext()); + PointerType *PtrTy = PointerType::get(M.getContext(), 0); + + Function *SignIntr = + Intrinsic::getOrInsertDeclaration(&M, Intrinsic::ptrauth_sign, {}); + Function *AuthIntr = + Intrinsic::getOrInsertDeclaration(&M, Intrinsic::ptrauth_auth, {}); + + auto *EmuFnTy = FunctionType::get(Int64Ty, {Int64Ty, Int64Ty}, false); + FunctionCallee EmuSignIntr = M.getOrInsertFunction("__emupac_pacda", EmuFnTy); + FunctionCallee EmuAuthIntr = M.getOrInsertFunction("__emupac_autda", EmuFnTy); + + auto CreateSign = [&](IRBuilder<> &B, Value *Val, Value *Disc, + OperandBundleDef DSBundle) { +Function *F = B.GetInsertBlock()->getParent(); +Attribute FSAttr = F->getFnAttribute("target-features"); +if (FSAttr.isValid() && FSAttr.getValueAsString().contains("+pauth")) + return B.CreateCall(SignIntr, {Val, B.getInt32(2), Disc}, DSBundle); +return B.CreateCall(EmuSignIntr, {Val, Disc}, DSBundle); + }; + + auto CreateAuth = [&](IRBuilder<> &B, Value *Val, Value *Disc, + OperandBundleDef DSBundle) { +Function *F = B.GetInsertBlock()->getParent(); +Attribute FSAttr = F->getFnAttribute("target-features"); +if (FSAttr.isValid() && FSAttr.getValueAsString().contains("+pauth")) + return B.CreateCall(AuthIntr, {Val, B.getInt32(2), Disc}, DSBundle); +return B.CreateCall(EmuAuthIntr, {Val, Disc}, DSBundle); + }; + + for (User *U : Intr.users()) { +auto *Call = cast(U); +auto *FieldName = cast( ojhunt wrote: I'm not really a backend person, but I feel that it's incorrect for the discriminator selection/computation to be occurring in llvm. I think clang should be passing the discriminator as a parameter to the intrinsics - this would also permit the PFP system to allow users to override the discriminator for fields, which is likely to end up being necessary, as time changes the exact types or naming of fields (side note: because I'm a muppet, I only just realized that the current model derives the discriminator from the name of the field which means the names of fields actually becomes ABI - it might be better to use something like "name of struct + offset in struct + type" though obviously that reduces the variation in discriminators - OTOH it's not uncommon for different projects to declare there own versions of POD structs with slightly different naming so maybe it helps there) https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -544,6 +544,7 @@ TYPE_TRAIT_2(__is_pointer_interconvertible_base_of, IsPointerInterconvertibleBas #include "clang/Basic/TransformTypeTraits.def" // Clang-only C++ Type Traits +TYPE_TRAIT_1(__has_non_relocatable_fields, HasNonRelocatableFields, KEYCXX) ojhunt wrote: I do not like this -- there are existing functions for handling whether an object is trivially copying, relocatable, etc that should just be updated to correctly report the traits of impacted types. See the various Sema functions querying `isAddressDiscriminated()`. That function currently only cares about pointer auth, but it would not be unreasonable to have it consider other properties, like PDP. In fact a lot of the behavior you're wanting in this feature is essentially covered by the PointerAuthQualfier. It would perhaps be reasonable to generalize that to something akin to SecurePointerQualifier that could be either a PAQ or PFP qualifier. Then most of the current semantic queries made to support pointer auth could just be a single shared implementation. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -1415,6 +1469,52 @@ void CodeGenFunction::CreateCoercedStore(llvm::Value *Src, Address Dst, } } + // Coercion directly through memory does not work if the structure has pointer + // field protection because the struct passed by value has a different bit + // pattern to the struct in memory, so we must read the elements one by one + // and use them to form the coerced structure. + std::vector PFPFields; + getContext().findPFPFields(SrcFETy, CharUnits::Zero(), PFPFields, true); ojhunt wrote: Same as for CoercedLoad :D https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] CodeGen: Fix implementation of __builtin_trivially_relocate. (PR #140312)
https://github.com/ojhunt requested changes to this pull request. https://github.com/llvm/llvm-project/pull/140312 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] CodeGen: Fix implementation of __builtin_trivially_relocate. (PR #140312)
@@ -8,7 +8,7 @@ struct S trivially_relocatable_if_eligible { }; // CHECK: @_Z4testP1SS0_ -// CHECK: call void @llvm.memmove.p0.p0.i64 +// CHECK: call void @llvm.memmove.p0.p0.i64({{.*}}, i64 8 ojhunt wrote: given this was a size issue, I think it's important this memmove check has the expected count, so can we have a few tests for multiple *different* counts? https://github.com/llvm/llvm-project/pull/140312 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
ojhunt wrote: > Hi Oliver, thanks for your comments! I'll address them below. > > > Thoughts: > > This should be opt-in on a field or struct granularity, not just a global > > behavior. > > This would certainly be easier if it were an opt-in behavior, as it would > allow avoiding a substantial amount of complexity that you point out below. > The problem is that would not make the solution very effective as a UAF > mitigation, as it would only help with code that was opted in, which is > likely a tiny fraction of an entire codebase. That fraction is unlikely to be > the fraction with UAF bugs, not only because of the size of the fraction but > also because if developers aren't thinking about memory safety issues, they > certainly won't be manually opting into this. With PFP, the problem that I > set out to solve was to find a way to automatically protect as many pointers > stored in memory as possible in standards-conforming code. > There are a significant number of standard layout types that contain meaningfully powerful data if they are compromised, which leads to saying the standard layout types should also be covered. That said, there's a significant cost to applying pointer protection universally though. The cost of existing mitigations rely on them not being global properties as they can saturate the required compute resources. I would be interested in a performance comparison of enumerating struct fields as you increase the number of fields being accessed just to see if there's a point that perf starts to collapse - I suspect there would be, but if the number of concurrent (from the cpu pov, not threading) fields accesses to get to that point is high enough it may not matter. Similar tests for increasing levels of dependent loads are also probably worth while. The intent is to make this behavior at least partially universal (all pointers in objects that are not standard layout), which means that extrapolating from individual loads, or loads from the same addresses, are not necessarily representative. In the RFC you talk about the time and/or cycles required to perform operations, but that's the best case, where the required units are not saturated. This gate on performance isn't unique to pointer protections: every operation consumes some amount of cpu resources, once those resources are saturated the pipeline stalls, so it's necessary to compare performance when many of these operations are happening in succession. Now there's some caching (saying the tag remains constant, etc) you might be able to do, but that's dependent on being willing to say "this is a software mitigation, so we're ok if we don't catch everything" - these are things that a cpu can mitigate in hardware via direct knowledge of what is happening on a given core, and peeking on the memory bus. (There are also performance costs when you're accessing the same memory on different cores, but that goes so bad so quickly even in normal cases I'm not concerned about that yet). > > In the RFC I think you mentioned not applying PFP to C types, but I'm > > unsure how you're deciding what is a C type? > > This is based on whether the type has standard layout. C does not permit > declaring a type that is non-standard layout. The issue here is that there are many cases where standard layout structs contain important data, which is why I'm not 100% sold on excluding C types from this. > > There are a lot of special cases being added in places that should not need > > to be aware of PFP - the core type queries should be returning correct > > values for types containing PFP fields. > > Agreed on type queries. The current adjustment to how trivially-relocatable > is defined is a workaround that will be removed once P2786 is fully > implemented in libc++, and at that point we shouldn't need > `__has_non_relocatable_fields` either. The intent is that turning on PFP > should be invisible to the developer (in other words, the modification to the > storage format of pointer fields in certain structs is permitted by the as-if > rule). As a result, most of the implementation is in CodeGen and below. My point was that this should not have a separate attempt to make trivial relocation work, it should make sure the type queries report the correct information, and the implementation of trivially_relocate agrees with that definition. Also trivial relocation isn't meaningfully supported or used anywhere yet - it's a new feature - so I don't think you should be prioritizing supporting that yet: just make sure that the (internal) trait queries return correct/consistent results for what can be handled, and worry about optimizing that later. > > > A lot of the "this read/write/copy/init special work" exactly aligns with > > the constraints of pointer auth, and I think the overall implementation > > could be made better by introducing the concept of a > > `HardenedPointerQualifier` or similar,
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -2201,6 +2215,22 @@ void CodeGenFunction::EmitCXXConstructorCall( EmitTypeCheck(CodeGenFunction::TCK_ConstructorCall, Loc, This, getContext().getRecordType(ClassDecl), CharUnits::Zero()); + // When initializing an object that has pointer field protection and whose + // fields are not trivially relocatable we must initialize any pointer fields + // to a valid signed pointer (any pointer value will do, but we just use null + // pointers). This is because if the object is subsequently copied, its copy + // constructor will need to read and authenticate any pointer fields in order + // to copy the object to a new address, which will fail if the pointers are + // uninitialized. + if (!getContext().arePFPFieldsTriviallyRelocatable(D->getParent())) { ojhunt wrote: Correctly initializing the object is the job of the constructor, not the caller. in cases where a constructor is being called, you should be ensuring that the constructor does the correct thing. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -544,6 +544,7 @@ TYPE_TRAIT_2(__is_pointer_interconvertible_base_of, IsPointerInterconvertibleBas #include "clang/Basic/TransformTypeTraits.def" // Clang-only C++ Type Traits +TYPE_TRAIT_1(__has_non_relocatable_fields, HasNonRelocatableFields, KEYCXX) ojhunt wrote: I think you misunderstand my intent (I tried to suggest in one the comments or the global comment): What I'm suggesting is that the qualifier would be implicitly added at the point of declaration, the overarching behavior you're after would be achieved that way. From your other comment about how you determine "C" strictness (just keying off standard layout) that might be challenging as it means you can't really finalize the types of fields until the struct definition is completed, and I don't know if there's anything else in C++ that behaves that way (@cor3ntin might have an idea). However I'm also not sure how reasonable it is to exclude standard layout types from this protection - what exactly is the rationale for that? (if the reason is a belief that substituting C++ libraries is fine but C libraries is not, I suspect that's not actually very sound, and silently ignoring some structs is similarly a hazard - it can result in "trivial" refactoring invisibly reducing security guarantees). Regarding P2786: the exact same issues apply to pointer authentication today - trying to solve them separately and outside of the existing triviality queries and operations is not the correct approach. I'm ok with the logic being distinct from pointer auth, though I would regret that as I believe the shared semantics warrant a shared implementation, but there is no reason for them to be distinct from any of the type queries. If you want the types to be trivially relocatable - which is not a requirement, it's a performance optimization - then ensure the existing queries and operations handle that correctly. Otherwise you run the risk of other code in clang using the existing trivial behavior queries, and expecting them to be correct, when they are not. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -1319,14 +1319,66 @@ static llvm::Value *CoerceIntOrPtrToIntOrPtr(llvm::Value *Val, llvm::Type *Ty, /// This safely handles the case when the src type is smaller than the /// destination type; in this situation the values of bits which not /// present in the src are undefined. -static llvm::Value *CreateCoercedLoad(Address Src, llvm::Type *Ty, +static llvm::Value *CreateCoercedLoad(Address Src, QualType SrcFETy, + llvm::Type *Ty, CodeGenFunction &CGF) { llvm::Type *SrcTy = Src.getElementType(); // If SrcTy and Ty are the same, just do a load. if (SrcTy == Ty) return CGF.Builder.CreateLoad(Src); + // Coercion directly through memory does not work if the structure has pointer + // field protection because the struct in registers has a different bit + // pattern to the struct in memory, so we must read the elements one by one + // and use them to form the coerced structure. + std::vector PFPFields; + CGF.getContext().findPFPFields(SrcFETy, CharUnits::Zero(), PFPFields, true); ojhunt wrote: It still doesn't make things worse than generating (and caching) a function, if the function is small then the inliner handles it - though here we're getting into: you should just be generating the correct copy, move, etc functions for the type, and all the other behaviors fall out of that automatically. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -7538,6 +7538,14 @@ static bool IsEligibleForTrivialRelocation(Sema &SemaRef, if (!SemaRef.IsCXXTriviallyRelocatableType(Field->getType())) return false; } + + // FIXME: PFP should not affect trivial relocatability, instead it should + // affect the implementation of std::trivially_relocate. See: + // https://discourse.llvm.org/t/rfc-structure-protection-a-family-of-uaf-mitigation-techniques/8/16?u=pcc + if (!SemaRef.Context.arePFPFieldsTriviallyRelocatable(D) && ojhunt wrote: I think it would be reasonable to make a general `ASTContext::HasAddressDiscriminatedData(...)` ( or similar method that can check for pointer auth and PFP, and an associated update of the existing `hasAddressDiscriminatedPointerAuth` calls to use that function for semantic checks related to "does this contain values with storage derived representation" at least. Don't quote me on the naming though, because I am terrible at that. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -2268,13 +2293,22 @@ CodeGenFunction::EmitNullInitialization(Address DestPtr, QualType Ty) { // Get and call the appropriate llvm.memcpy overload. Builder.CreateMemCpy(DestPtr, SrcPtr, SizeVal, false); -return; + } else { +// Otherwise, just memset the whole thing to zero. This is legal ojhunt wrote: I'll re-read this section once you deal with the misery that is indentation rules :-/ https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] Add pointer field protection feature. (PR #133538)
@@ -441,6 +445,254 @@ bool PreISelIntrinsicLowering::expandMemIntrinsicUses(Function &F) const { return Changed; } +namespace { + +enum class PointerEncoding { + Rotate, + PACCopyable, + PACNonCopyable, +}; + +bool expandProtectedFieldPtr(Function &Intr) { + Module &M = *Intr.getParent(); + bool IsAArch64 = Triple(M.getTargetTriple()).isAArch64(); + + std::set NonPFPFields; + std::set LoadsStores; + + Type *Int8Ty = Type::getInt8Ty(M.getContext()); + Type *Int64Ty = Type::getInt64Ty(M.getContext()); + PointerType *PtrTy = PointerType::get(M.getContext(), 0); + + Function *SignIntr = + Intrinsic::getOrInsertDeclaration(&M, Intrinsic::ptrauth_sign, {}); + Function *AuthIntr = + Intrinsic::getOrInsertDeclaration(&M, Intrinsic::ptrauth_auth, {}); + + auto *EmuFnTy = FunctionType::get(Int64Ty, {Int64Ty, Int64Ty}, false); + FunctionCallee EmuSignIntr = M.getOrInsertFunction("__emupac_pacda", EmuFnTy); + FunctionCallee EmuAuthIntr = M.getOrInsertFunction("__emupac_autda", EmuFnTy); + + auto CreateSign = [&](IRBuilder<> &B, Value *Val, Value *Disc, + OperandBundleDef DSBundle) { +Function *F = B.GetInsertBlock()->getParent(); +Attribute FSAttr = F->getFnAttribute("target-features"); +if (FSAttr.isValid() && FSAttr.getValueAsString().contains("+pauth")) + return B.CreateCall(SignIntr, {Val, B.getInt32(2), Disc}, DSBundle); +return B.CreateCall(EmuSignIntr, {Val, Disc}, DSBundle); + }; + + auto CreateAuth = [&](IRBuilder<> &B, Value *Val, Value *Disc, + OperandBundleDef DSBundle) { +Function *F = B.GetInsertBlock()->getParent(); +Attribute FSAttr = F->getFnAttribute("target-features"); +if (FSAttr.isValid() && FSAttr.getValueAsString().contains("+pauth")) + return B.CreateCall(AuthIntr, {Val, B.getInt32(2), Disc}, DSBundle); +return B.CreateCall(EmuAuthIntr, {Val, Disc}, DSBundle); + }; + + for (User *U : Intr.users()) { +auto *Call = cast(U); +auto *FieldName = cast( ojhunt wrote: that would definitely be my preference, and I _think_ it should end up being simpler. https://github.com/llvm/llvm-project/pull/133538 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits