https://github.com/shiltian created https://github.com/llvm/llvm-project/pull/131557
The value of a null pointer is not always `0`. For example, on AMDGPU, the null pointer in address spaces 3 and 5 is `0xffffffff`. Currently, there is no target-independent way to get this information, making it difficult and error-prone to handle null pointers in target-agnostic code. We do have `ConstantPointerNull`, but it might be a little confusing and misleading. It represents a pointer with an all-zero value rather than necessarily a real `nullptr`. This PR introduces the concept of a *sentinel pointer value* to `DataLayout`, representing the actual `nullptr` value for a given address space. The changes include: - A new interface function: ``` APInt getSentinelPointerValue(unsigned AS) ``` This function returns an `APInt` representing the sentinel pointer value for the given address space `AS`. An `APInt` is used instead of a literal integer to support cases where pointers are wider than 64 bits (e.g., AMDGPU’s address space 8). - An extension to the data layout string format: ``` p[n]:<size>:<abi>[:<pref>[:<idx>[:<sentinel>]]] ``` The new `<sentinel>` component specifies the sentinel value for the corresponding pointer. It currently supports two values: - `0` for an all-zero value - `f` for a full-bit set value These two values are the most common representations of `nullptr`. It is unlikely that any target would define `nullptr` as a random value. A follow-up patch series will introduce an equivalent of `ConstantPointerNull` that represents the actual `nullptr`, built on top of this PR. >From c966236e1ec99fbd45063e0363873c8d8abe4c50 Mon Sep 17 00:00:00 2001 From: Shilei Tian <i...@tianshilei.me> Date: Sun, 16 Mar 2025 23:43:43 -0400 Subject: [PATCH] [DataLayout] Introduce sentinel pointer value MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The value of a null pointer is not always `0`. For example, on AMDGPU, the null pointer in address spaces 3 and 5 is `0xffffffff`. Currently, there is no target-independent way to get this information, making it difficult and error-prone to handle null pointers in target-agnostic code. We do have `ConstantPointerNull`, but it might be a little confusing and misleading. It represents a pointer with an all-zero value rather than necessarily a real `nullptr`. This PR introduces the concept of a *sentinel pointer value* to `DataLayout`, representing the actual `nullptr` value for a given address space. The changes include: - A new interface function: ``` APInt getSentinelPointerValue(unsigned AS) ``` This function returns an `APInt` representing the sentinel pointer value for the given address space `AS`. An `APInt` is used instead of a literal integer to support cases where pointers are wider than 64 bits (e.g., AMDGPU’s address space 8). - An extension to the data layout string format: ``` p[n]:<size>:<abi>[:<pref>[:<idx>[:<sentinel>]]] ``` The new `<sentinel>` component specifies the sentinel value for the corresponding pointer. It currently supports two values: - `0` for an all-zero value - `f` for a full-bit set value These two values are the most common representations of `nullptr`. It is unlikely that any target would define `nullptr` as a random value. A follow-up patch series will introduce an equivalent of `ConstantPointerNull` that represents the actual `nullptr`, built on top of this PR. --- clang/lib/Basic/Targets/AMDGPU.cpp | 6 ++-- llvm/docs/LangRef.rst | 11 +++--- llvm/include/llvm/IR/DataLayout.h | 8 ++++- llvm/lib/IR/DataLayout.cpp | 54 ++++++++++++++++++++++------ llvm/unittests/IR/DataLayoutTest.cpp | 20 +++++++++-- 5 files changed, 78 insertions(+), 21 deletions(-) diff --git a/clang/lib/Basic/Targets/AMDGPU.cpp b/clang/lib/Basic/Targets/AMDGPU.cpp index a42b4589fb5ac..1506387a41d96 100644 --- a/clang/lib/Basic/Targets/AMDGPU.cpp +++ b/clang/lib/Basic/Targets/AMDGPU.cpp @@ -32,9 +32,9 @@ static const char *const DataLayoutStringR600 = "-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1"; static const char *const DataLayoutStringAMDGCN = - "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32" - "-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:" - "32-v48:64-v96:128" + "e-p:64:64-p1:64:64-p2:32:32-p3:32:32:32:32:f-p4:64:64-p5:32:32:32:32:f" + "-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16" + "-v24:32-v32:32-v48:64-v96:128" "-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1" "-ni:7:8:9"; diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst index c0567090fdd2a..a22020da38f83 100644 --- a/llvm/docs/LangRef.rst +++ b/llvm/docs/LangRef.rst @@ -3134,7 +3134,7 @@ as follows: ``A<address space>`` Specifies the address space of objects created by '``alloca``'. Defaults to the default address space of 0. -``p[n]:<size>:<abi>[:<pref>][:<idx>]`` +``p[<n>]:<size>:<abi>[:<pref>[:<idx>[:<sentinel>]]]`` This specifies the *size* of a pointer and its ``<abi>`` and ``<pref>``\erred alignments for address space ``n``. ``<pref>`` is optional and defaults to ``<abi>``. The fourth parameter ``<idx>`` is the size of the @@ -3143,7 +3143,10 @@ as follows: specified, the default index size is equal to the pointer size. All sizes are in bits. The address space, ``n``, is optional, and if not specified, denotes the default address space 0. The value of ``n`` must be - in the range [1,2^24). + in the range [1,2^24). The fifth parameter ``<sentinel>`` specifies the + sentinel value of the pointer for the corresponding address space. It + currently accepts two values: ``0`` for an all-zero value and ``f`` for a + full-bit set value. The default sentinel pointer value is all-zero. ``i<size>:<abi>[:<pref>]`` This specifies the alignment for an integer type of a given bit ``<size>``. The value of ``<size>`` must be in the range [1,2^24). @@ -13045,9 +13048,9 @@ This instruction requires several arguments: - Caller and callee both have the calling convention ``fastcc`` or ``tailcc``. - The call is in tail position (ret immediately follows call and ret uses value of call or is void). - - Option ``-tailcallopt`` is enabled, ``llvm::GuaranteedTailCallOpt`` is + - Option ``-tailcallopt`` is enabled, ``llvm::GuaranteedTailCallOpt`` is ``true``, or the calling convention is ``tailcc``. - - `Platform-specific constraints are met. + - `Platform-specific constraints are met. <CodeGenerator.html#tail-call-optimization>`_ #. The optional ``notail`` marker indicates that the optimizers should not add diff --git a/llvm/include/llvm/IR/DataLayout.h b/llvm/include/llvm/IR/DataLayout.h index 2ad080e6d0cd2..1257ffb32a4df 100644 --- a/llvm/include/llvm/IR/DataLayout.h +++ b/llvm/include/llvm/IR/DataLayout.h @@ -78,6 +78,7 @@ class DataLayout { Align ABIAlign; Align PrefAlign; uint32_t IndexBitWidth; + APInt SentinelValue; /// Pointers in this address space don't have a well-defined bitwise /// representation (e.g. may be relocated by a copying garbage collector). /// Additionally, they may also be non-integral (i.e. containing additional @@ -148,7 +149,7 @@ class DataLayout { /// Sets or updates the specification for pointer in the given address space. void setPointerSpec(uint32_t AddrSpace, uint32_t BitWidth, Align ABIAlign, Align PrefAlign, uint32_t IndexBitWidth, - bool IsNonIntegral); + APInt SentinelValue, bool IsNonIntegral); /// Internal helper to get alignment for integer of given bitwidth. Align getIntegerAlignment(uint32_t BitWidth, bool abi_or_pref) const; @@ -552,6 +553,11 @@ class DataLayout { /// /// This includes an explicitly requested alignment (if the global has one). Align getPreferredAlign(const GlobalVariable *GV) const; + + /// Returns the sentinel pointer value for a given address space. If the + /// address space is invalid, it defaults to the sentinel pointer value of + /// address space 0, aligning with the behavior of \p getPointerSpec. + APInt getSentinelPointerValue(unsigned AS) const; }; inline DataLayout *unwrap(LLVMTargetDataRef P) { diff --git a/llvm/lib/IR/DataLayout.cpp b/llvm/lib/IR/DataLayout.cpp index 0cf0bfc9702d3..058775d68465b 100644 --- a/llvm/lib/IR/DataLayout.cpp +++ b/llvm/lib/IR/DataLayout.cpp @@ -206,9 +206,10 @@ constexpr DataLayout::PrimitiveSpec DefaultVectorSpecs[] = { }; // Default pointer type specifications. -constexpr DataLayout::PointerSpec DefaultPointerSpecs[] = { - // p0:64:64:64:64 - {0, 64, Align::Constant<8>(), Align::Constant<8>(), 64, false}, +const DataLayout::PointerSpec DefaultPointerSpecs[] = { + // p0:64:64:64:64:0 + {0, 64, Align::Constant<8>(), Align::Constant<8>(), 64, APInt(64, 0), + false}, }; DataLayout::DataLayout() @@ -296,6 +297,22 @@ static Error parseSize(StringRef Str, unsigned &BitWidth, return Error::success(); } +static Error parseSentinelValue(StringRef Str, APInt &V) { + if (Str.empty()) + return createStringError("sentinel value component cannot be empty"); + if (Str.size() != 1) + return createStringError("sentinel value component must be a '0' or 'f'"); + if (Str[0] == '0') { + V.clearAllBits(); + return Error::success(); + } + if (Str[0] == 'f') { + V.setAllBits(); + return Error::success(); + } + return createStringError("sentinel value component must be a '0' or 'f'"); +} + /// Attempts to parse an alignment component of a specification. /// /// On success, returns the value converted to byte amount in \p Alignment. @@ -409,13 +426,14 @@ Error DataLayout::parseAggregateSpec(StringRef Spec) { } Error DataLayout::parsePointerSpec(StringRef Spec) { - // p[<n>]:<size>:<abi>[:<pref>[:<idx>]] + // p[<n>]:<size>:<abi>[:<pref>[:<idx>[:<sentinel>]]] SmallVector<StringRef, 5> Components; assert(Spec.front() == 'p'); Spec.drop_front().split(Components, ':'); - if (Components.size() < 3 || Components.size() > 5) - return createSpecFormatError("p[<n>]:<size>:<abi>[:<pref>[:<idx>]]"); + if (Components.size() < 3 || Components.size() > 6) + return createSpecFormatError( + "p[<n>]:<size>:<abi>[:<pref>[:<idx>[:<sentinel>]]]"); // Address space. Optional, defaults to 0. unsigned AddrSpace = 0; @@ -454,8 +472,14 @@ Error DataLayout::parsePointerSpec(StringRef Spec) { return createStringError( "index size cannot be larger than the pointer size"); + APInt SentinelValue(BitWidth, 0); + if (Components.size() > 5) { + if (Error Err = parseSentinelValue(Components[5], SentinelValue)) + return Err; + } + setPointerSpec(AddrSpace, BitWidth, ABIAlign, PrefAlign, IndexBitWidth, - false); + SentinelValue, /*IsNonIntegral=*/false); return Error::success(); } @@ -631,7 +655,7 @@ Error DataLayout::parseLayoutString(StringRef LayoutString) { // the spec for AS0, and we then update that to mark it non-integral. const PointerSpec &PS = getPointerSpec(AS); setPointerSpec(AS, PS.BitWidth, PS.ABIAlign, PS.PrefAlign, PS.IndexBitWidth, - true); + PS.SentinelValue, /*IsNonIntegral=*/true); } return Error::success(); @@ -679,16 +703,19 @@ DataLayout::getPointerSpec(uint32_t AddrSpace) const { void DataLayout::setPointerSpec(uint32_t AddrSpace, uint32_t BitWidth, Align ABIAlign, Align PrefAlign, - uint32_t IndexBitWidth, bool IsNonIntegral) { + uint32_t IndexBitWidth, APInt SentinelValue, + bool IsNonIntegral) { auto I = lower_bound(PointerSpecs, AddrSpace, LessPointerAddrSpace()); if (I == PointerSpecs.end() || I->AddrSpace != AddrSpace) { PointerSpecs.insert(I, PointerSpec{AddrSpace, BitWidth, ABIAlign, PrefAlign, - IndexBitWidth, IsNonIntegral}); + IndexBitWidth, SentinelValue, + IsNonIntegral}); } else { I->BitWidth = BitWidth; I->ABIAlign = ABIAlign; I->PrefAlign = PrefAlign; I->IndexBitWidth = IndexBitWidth; + I->SentinelValue = SentinelValue; I->IsNonIntegral = IsNonIntegral; } } @@ -1020,3 +1047,10 @@ Align DataLayout::getPreferredAlign(const GlobalVariable *GV) const { } return Alignment; } + +APInt DataLayout::getSentinelPointerValue(unsigned AS) const { + auto I = lower_bound(PointerSpecs, AS, LessPointerAddrSpace()); + if (I != PointerSpecs.end() || I->AddrSpace == AS) + return I->SentinelValue; + return PointerSpecs[0].SentinelValue; +} diff --git a/llvm/unittests/IR/DataLayoutTest.cpp b/llvm/unittests/IR/DataLayoutTest.cpp index 16a603ff6416f..79d46ebfd7d8e 100644 --- a/llvm/unittests/IR/DataLayoutTest.cpp +++ b/llvm/unittests/IR/DataLayoutTest.cpp @@ -313,11 +313,12 @@ TEST(DataLayout, ParsePointerSpec) { EXPECT_THAT_EXPECTED(DataLayout::parse(Str), Succeeded()); for (StringRef Str : - {"p", "p0", "p:32", "p0:32", "p:32:32:32:32:32", "p0:32:32:32:32:32"}) + {"p", "p0", "p:32", "p0:32", "p:32:32:32:32:32:0", "p0:32:32:32:32:32:0"}) EXPECT_THAT_EXPECTED( DataLayout::parse(Str), - FailedWithMessage("malformed specification, must be of the form " - "\"p[<n>]:<size>:<abi>[:<pref>[:<idx>]]\"")); + FailedWithMessage( + "malformed specification, must be of the form " + "\"p[<n>]:<size>:<abi>[:<pref>[:<idx>[:<sentinel>]]]\"")); // address space for (StringRef Str : {"p0x0:32:32", "px:32:32:32", "p16777216:32:32:32:32"}) @@ -401,6 +402,19 @@ TEST(DataLayout, ParsePointerSpec) { EXPECT_THAT_EXPECTED( DataLayout::parse(Str), FailedWithMessage("index size cannot be larger than the pointer size")); + + // sentinel value + for (StringRef Str : + {"p:32:32:32:32:a", "p0:32:32:32:32:ab", "p42:32:32:32:32:123"}) + EXPECT_THAT_EXPECTED( + DataLayout::parse(Str), + FailedWithMessage("sentinel value component must be a '0' or 'f'")); + + for (StringRef Str : + {"p:32:32:32:32:", "p0:32:32:32:32:", "p42:32:32:32:32:"}) + EXPECT_THAT_EXPECTED( + DataLayout::parse(Str), + FailedWithMessage("sentinel value component cannot be empty")); } TEST(DataLayoutTest, ParseNativeIntegersSpec) { _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits