llvmbot wrote:
@llvm/pr-subscribers-clang

Author: Sirui Mu (Lancern)

<details>
<summary>Changes</summary>

This patch migrates the existing ClangIR documents that are written in Markdown format to reStructuredText format to align CIR's documents with clang's documentation policy.

Closes #191850.

---

Patch is 182.35 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/192066.diff

4 Files Affected:

- (removed) clang/docs/CIR/ABILowering.md (-556)
- (added) clang/docs/CIR/ABILowering.rst (+589)
- (removed) clang/docs/CIR/CleanupAndEHDesign.md (-1587)
- (added) clang/docs/CIR/CleanupAndEHDesign.rst (+1631)

``````````diff
diff --git a/clang/docs/CIR/ABILowering.md b/clang/docs/CIR/ABILowering.md
deleted file mode 100644
index bcc29dc8544ca..0000000000000
--- a/clang/docs/CIR/ABILowering.md
+++ /dev/null
@@ -1,556 +0,0 @@
-# ClangIR ABI Lowering - Design Document
-
-## 1. Introduction
-
-This design describes calling convention lowering that builds on the LLVM ABI
-Lowering Library in `llvm/lib/ABI/`: we use its `abi::Type*` and target ABI
-logic and add an MLIR integration layer (ABITypeMapper, ABI lowering pass, and
-dialect rewriters). The framework relies on the LLVM ABI library as the single
-source of truth for ABI classification. MLIR dialects use it via an adapter
-layer. The design provides a way to perform ABI-compliant calling convention
-lowering that can be used by any MLIR dialect that implements the necessary
-interfaces. Inputs are high-level function signatures in CIR, FIR, or other
-MLIR dialect. Outputs are ABI-lowered signatures and call sites. Lowering
-runs as an MLIR pass in the compilation pipeline.
-
-### 1.1 Design Goals
-
-Building on the LLVM ABI library and adding an MLIR integration layer avoids
-duplicating complex ABI logic across MLIR dialects, reduces maintenance, and
-keeps a single source of ABI compliance in `llvm/lib/ABI/`. The separation
-between the ABI library (classification) and dialect-specific ABIRewriteContext
-(rewriting) enables clearer testing and a straightforward migration path from
-the CIR incubator by porting useful algorithms into the ABI library where
-appropriate.
-
-A central goal is that generated code be call-compatible with Classic Clang
-CodeGen and other compilers. Parity is with Classic Clang CodeGen output,
-not only with the incubator. Success means CIR correctly lowers x86_64 and
-AArch64 calling conventions with full ABI compliance using the LLVM ABI library
-and MLIR integration layer; FIR can adopt the same infrastructure with minimal
-dialect-specific adaptation (e.g. cdecl when calling C from Fortran). ABI
-compliance will be validated through differential testing against Classic Clang
-CodeGen, and performance overhead should remain under 5% compared to a direct,
-dialect-specific implementation. Initial scope focuses on fixed-argument
-functions; variadic support (varargs) is deferred.
-
-## 2. Background and Context
-
-### 2.1 What is Calling Convention Lowering?
-
-Calling convention lowering transforms high-level function signatures to match
-target ABI (Application Binary Interface) requirements. When a function is
-declared at the source level with convenient, language-level types, these types
-must be translated into the specific register assignments, memory layouts, and
-calling sequences that the target architecture expects. For example, on x86_64
-System V ABI, a struct containing two 64-bit integers might be "expanded" into
-two separate arguments passed in registers, rather than being passed as a single
-aggregate:
-
-```
-// High-level CIR
-func @foo(i32, struct<i64, i64>) -> i32
-
-// After ABI lowering
-func @foo(i32 %arg0, i64 %arg1, i64 %arg2) -> i32
-// ^ ^ ^ ^
-// | | +--------+ struct expanded into fields
-// | +---- first field passed in register
-// +---- small integer passed in register
-```
-
-Calling convention lowering is complex for several reasons: it is highly
-target-specific (each architecture has different rules for registers vs.
-memory), type-dependent (rules differ for integers, floats, structs, unions,
-arrays), and context-sensitive (varargs, virtual calls, conventions like
-vectorcall or preserve_most). The same target may have multiple ABI variants
-(e.g. x86_64 System V vs. Windows x64), adding further complexity.
-
-### 2.2 Existing Implementations
-
-#### Classic Clang CodeGen
-
-Classic Clang CodeGen (located in `clang/lib/CodeGen/`) transforms calling
-conventions during the AST-to-LLVM-IR lowering process. This implementation is
-mature and well-tested, handling all supported targets with comprehensive ABI
-coverage. However, it's tightly coupled to both Clang's AST representation and
-LLVM IR, making it difficult to reuse for MLIR-based frontends.
-
-#### CIR Incubator
-
-The CIR incubator includes a calling convention lowering pass in
-`clang/lib/CIR/Dialect/Transforms/TargetLowering/` that transforms CIR
-operations into ABI-lowered CIR operations as an MLIR pass. This implementation
-successfully adapted logic from Classic Clang CodeGen to work within the MLIR
-framework. However, it relies on CIR-specific types and operations, preventing
-reuse by other MLIR dialects.
-
-#### LLVM ABI Lowering Library
-
-A 2025 Google Summer of Code project produced [PR
-#140112](https://github.com/llvm/llvm-project/pull/140112), which proposes
-extracting Clang's ABI logic into a reusable library in `llvm/lib/ABI/`. The
-design centers on a shadow type system (`abi::Type*`) separate from both Clang's
-AST types and LLVM IR types, enabling the ABI classification algorithms to work
-independently of any specific frontend representation. The library includes
-abstract `ABIInfo` base classes and target-specific implementations (e.g.
-x86_64, BPF) and provides QualTypeMapper for Clang to map `QualType` to
-`abi::Type*`.
-
-Our approach is to complete and extend this library and use it as the single
-source of truth for ABI classification. One implementation in one place reduces
-duplication, simplifies bug fixes, and creates a path for Classic Clang CodeGen
-to use the same logic in the future. MLIR dialects (CIR, FIR, and others) will
-use the library via an adapter layer rather than reimplementing ABI logic.
-
-**Current state.** The x86_64 implementation is largely complete and under
-review. AArch64 and some other targets are not yet implemented; there is no
-MLIR integration today. The work is being upstreamed in smaller parts (e.g.
-[PR 158329](https://github.com/llvm/llvm-project/pull/158329)); progress is
-limited by reviewer bandwidth. The overhead of the shadow type system
-(converting to and from `abi::Type*`) has been measured at under 0.1% for clang
--O0, so it is negligible for CIR. Our approach therefore depends on the ABI
-library being merged upstream or our contributions to it being accepted.
-
-**Our approach.** The approach is to complete and extend the ABI library (e.g.
-AArch64, review feedback, tests) and add an **MLIR integration layer** so that
-MLIR dialects can use it:
-
-- **ABITypeMapper**: maps `mlir::Type` to `abi::Type*`, analogous to
-  QualTypeMapper for Clang.
-
-- **MLIR ABI lowering pass**: uses the library's `ABIInfo` for classification,
-  then performs dialect-specific rewriting via `ABIRewriteContext` for CIR, FIR,
-  and other dialects.
-
-The CIR incubator serves as a **reference only** (e.g. for AArch64 algorithms).
-We do not upstream the incubator's CIR-specific ABI implementation as the
-long-term solution; we port useful algorithms into the ABI library where
-appropriate.
-
-### 2.3 Requirements for MLIR Dialects
-
-CIR needs to lower C/C++ calling conventions correctly, with initial support for
-x86_64 and AArch64 targets. It must handle structs, unions, and complex types,
-as well as support instance methods and virtual calls. FIR's initial need is
-**cdecl for calling C from Fortran** (C interop); that is in scope.
-Fortran-specific ABI semantics (e.g. CHARACTER hidden length parameters, array
-descriptors) are out of initial scope; full Fortran ABI lowering is a broader
-goal. Both dialects share common requirements: strict target ABI compliance,
-efficient lowering with minimal overhead, extensibility for adding new target
-architectures, and comprehensive testability and validation capabilities.
-
-## 3. Proposed Solution
-
-**Core.** The LLVM ABI library in `llvm/lib/ABI/` performs ABI classification on
-`abi::Type*`. It provides `ABIInfo` and target-specific implementations
-(x86_64, BPF, and eventually AArch64 and others). This is the single place
-where ABI rules are implemented.
-
-**MLIR side.** To use this library from MLIR dialects we add an integration
-layer: (1) **ABITypeMapper** maps `mlir::Type` to `abi::Type*` (analogous to
-QualTypeMapper for Clang). (2) A **generic ABI lowering pass** invokes the
-library's `ABIInfo` for classification, then (3) performs **dialect-specific
-rewriting** via the `ABIRewriteContext` interface—each dialect (CIR, FIR, etc.)
-implements only the glue to create its own operations (e.g. `cir.call`,
-`fir.call`). Classification logic is shared; operation creation is
-dialect-specific.
-
-The following diagram shows the layering. At the top, the ABI library holds
-the ABI logic. In the middle, adapters connect frontends to it: Classic Clang
-CodeGen uses QualTypeMapper; MLIR uses ABITypeMapper and the ABI lowering pass.
-At the bottom, each dialect implements `ABIRewriteContext` only; FIR is shown as
-a consumer for cdecl/C interop (e.g. calling C from Fortran).
-
-```
-┌─────────────────────────────────────────────────────────────────┐
-│ LLVM ABI Library (llvm/lib/ABI/) │
-│ ABIInfo, abi::Type*, target implementations (X86, AArch64,…) │
-└─────────────────────────────────────────────────────────────────┘
- │
- ┌─────────────────┴─────────────────┐
- │ │
- ▼ ▼
-┌───────────────────────┐ ┌───────────────────────────────┐
-│ Classic CodeGen │ │ MLIR adapter │
-│ QualTypeMapper │ │ ABITypeMapper + ABI pass │
-└───────────────────────┘ └───────────────────────────────┘
- │
- ┌────────────────┼────────────────┐
- │ │ │
- ▼ ▼ ▼
- ┌────────────┐ ┌────────────┐ ┌────────────┐
- │ CIR │ │ FIR │ │ Future │
- │ ABIRewrite │ │ (cdecl/C │ │ Dialects │
- │ Context │ │ interop) │ │ │
- └────────────┘ └────────────┘ └────────────┘
-```
-
-## 4. Design Overview
-
-### 4.1 Architecture Diagram
-
-The following diagram shows how the design builds on the ABI library (Section
-3). At the top, the ABI library holds the classification logic. The middle
-layer adapts MLIR to the ABI library: ABITypeMapper converts `mlir::Type` to
-`abi::Type*`, and the MLIR ABI lowering pass invokes the library's `ABIInfo` and
-uses the classification
-to drive rewriting. At the bottom, each dialect implements only
-`ABIRewriteContext` for operation creation; there is no separate type
-abstraction layer in MLIR for classification—that lives in the ABI library.
-
-```
-┌─────────────────────────────────────────────────────────────────────────┐
-│ LLVM ABI Library (llvm/lib/ABI/) — single source of truth │
-│ abi::Type*, ABIInfo, target implementations (X86_64, AArch64, …) │
-│ Input: abi::Type* → Output: classification (ABIArgInfo, etc.) │
-└─────────────────────────────────────────────────────────────────────────┘
- │
- ▼
-┌─────────────────────────────────────────────────────────────────────────┐
-│ MLIR adapter │
-│ ABITypeMapper (mlir::Type → abi::Type*) + MLIR ABI lowering pass │
-│ (1) Map types (2) Call ABIInfo (3) Drive rewriting from │
-│ classification result │
-└─────────────────────────────────────────────────────────────────────────┘
- │
- ┌─────────────────┼─────────────────┐
- ▼ ▼ ▼
- ┌────────────┐ ┌────────────┐ ┌────────────┐
- │ CIR │ │ FIR │ │ Future │
- │ ABIRewrite │ │ ABIRewrite │ │ Dialects │
- │ Context │ │ Context │ │ │
- └────────────┘ └────────────┘ └────────────┘
- Dialect-specific operation creation only (no type
- abstraction for classification in MLIR)
-```
-
-### 4.2 ABI Library, Adapter, and Dialect Layers
-
-The architecture has three parts. **The ABI library** (`llvm/lib/ABI/`) is the
-single source of truth for ABI classification: it operates on `abi::Type*` and
-produces classification results (e.g. ABIArgInfo, ABIFunctionInfo).
-Target-specific `ABIInfo` implementations (X86_64, AArch64, etc.) live there.
-The **adapter layer** is MLIR-specific: ABITypeMapper maps `mlir::Type` to
-`abi::Type*`, and the MLIR ABI lowering pass (1) maps types, (2) calls the
-library's ABIInfo, and (3) uses the classification to drive rewriting. The
-**dialect layer** is only ABIRewriteContext: each dialect (CIR, FIR) implements
-operation creation (createFunction, createCall, createExtractValue, etc.).
-There is no type abstraction layer in MLIR for classification; type queries for
-ABI are performed on `abi::Type*` inside the ABI library.
-
-### 4.3 Key Components
-
-The framework is built from the following components. **The ABI library**
-(`llvm/lib/ABI/`) provides the single source of truth for ABI classification:
-the `abi::Type*` type system, the `ABIInfo` base and target-specific
-implementations (e.g. X86_64, AArch64), and the classification result types
-(e.g. ABIArgInfo, ABIFunctionInfo). **ABITypeMapper** maps `mlir::Type` to
-`abi::Type*` so that MLIR dialect types can be classified by the ABI library.
-The generic mapper relies on existing MLIR type interfaces (e.g.
-`DataLayoutTypeInterface`) for size and alignment, and pattern-matches on
-standard type categories (integers, floats, pointers, structs, arrays,
-vectors) to build `abi::Type*`. Dialects whose types do not conform to
-standard MLIR type categories (e.g. CIR's `cir::IntType` is not
-`mlir::IntegerType`) may need dialect-aware mapping alongside the generic
-mapper to preserve semantics such as signedness, pointer identity, and
-record field structure.
-The **MLIR ABI lowering pass** orchestrates the flow: it uses ABITypeMapper,
-calls the library's ABIInfo, and drives rewriting from the classification
-result. **ABIRewriteContext** is the dialect-specific interface for operation
-creation (each dialect implements it to produce e.g. cir.call, fir.call). A
-**target registry** (or equivalent) is used to select the appropriate ABIInfo
-for the compilation target. There is no ABITypeInterface or separate "ABIInfo
-in MLIR"; classification lives entirely in the ABI library.
-
-### 4.4 ABI Lowering Flow: How the Pieces Fit Together
-
-This section describes the end-to-end flow of ABI lowering, showing how all
-interfaces and components work together.
-
-#### Step 1: Function Signature Analysis
-
-The ABI lowering pass begins by analyzing the function signature. Function
-operations are identified via MLIR's `FunctionOpInterface`, which provides
-access to the function type, argument types, and return types. The pass
-extracts the parameter types and return type to prepare them for
-classification. At this stage, the types are still in their
-high-level, dialect-specific form (e.g., `!cir.struct` for CIR, or `!fir.type`
-for FIR). The pass collects these types into a list that will be fed to the
-classification logic in the next step.
-
-```
-Input: func @foo(%arg0: !cir.int<u, 32>,
-                 %arg1: !cir.struct<{!cir.int<u, 64>,
-                                     !cir.int<u, 64>}>) -> !cir.int<u, 32>
-```
-
-#### Step 2: Type Mapping via ABITypeMapper
-
-For each argument and the return type, the pass maps `mlir::Type` to
-`abi::Type*` using ABITypeMapper. The mapper produces the representation that
-the library's ABIInfo expects; optionally, it can map back to MLIR types for coercion
-types when needed.
-
-```cpp
-// Map dialect types to the library's type system
-ABITypeMapper abiTypeMapper(module.getDataLayout());
-abi::Type *arg0Abi = abiTypeMapper.map(arg0Type); // i32 -> IntegerType
-abi::Type *arg1Abi = abiTypeMapper.map(arg1Type); // struct -> RecordType
-abi::Type *retAbi = abiTypeMapper.map(returnType);
-```
-
-**Key Point**: Classification runs in the ABI library on `abi::Type*`; ABITypeMapper is
-the only bridge from dialect types to that representation.
-
-#### Step 3: ABI Classification
-
-The library's target-specific `ABIInfo` (e.g. X86_64) performs classification on
-`abi::Type*` and produces the library's classification result (e.g. ABIFunctionInfo
-and ABIArgInfo as defined in `llvm/lib/ABI/`):
-
-```cpp
-// The MLIR ABI lowering pass obtains the ABIInfo from the target
-// registry based on the module's target triple (see Section 5.2).
-llvm::abi::ABIInfo *abiInfo = getABIInfo(); // e.g. X86_64
-llvm::abi::ABIFunctionInfo abiFI;
-abiInfo->computeInfo(abiFI, arg0Abi, arg1Abi, retAbi);
-// For struct<i64,i64> on x86_64: produces Expand (two i64 args)
-```
-
-Output: the library's classification (e.g. ABIFunctionInfo) for all arguments and
-return:
-- `%arg0 (i32)` → Direct (pass as-is)
-- `%arg1 (struct)` → Expand (split into two i64 fields)
-- Return type → Direct
-
-#### Step 4: Function Signature Rewriting
-
-After the library's classification is complete, the pass rewrites the function to match
-the ABI requirements using the dialect's `ABIRewriteContext`. The
-classification result (from the ABI library) describes the lowered signature; the rewrite
-context creates the actual dialect operations. For example, if a struct is
-classified as "Expand", the new function signature will have multiple scalar
-parameters instead of the single struct parameter.
-
-```cpp
-ABIRewriteContext &ctx = getDialectRewriteContext();
-
-// Create new function with lowered signature
-FunctionType newType = ...; // (i32, i64, i64) -> i32
-Operation *newFunc = ctx.createFunction(loc, "foo", newType);
-```
-
-**Key Point**: The original function had signature `(i32, struct) -> i32`, but
-the ABI-lowered function has signature `(i32, i64, i64) -> i32` with the struct
-expanded into its constituent fields.
-
-#### Step 5: Argument Expansion
-
-With the function signature rewritten, the pass updates all call sites to match
-the new signature, using the classification from the ABI library to drive rewriting via
-`ABIRewriteContext`. For arguments classified as "Expand", the pass breaks down
-the aggregate into its constituent parts (e.g. struct into two i64 values).
-The rewrite context provides operations to extract fields and construct the new
-call with the expanded argument list.
-
-```cpp
-// Original call: call @foo(%val0, %structVal)
-// Need to extract struct fields:
-
-Value field0 = ctx.createExtractValue(loc, structVal, {0}); // extract 1st i64
-Value field1 = ctx.createExtractValue(loc, structVal, {1}); // extract 2nd i64
-
-// New call with expanded arguments
-ctx.createCall(loc, newFunc, {resultType}, {val0, field0, field1});
-```
-
-**Key Point**: `ABIRewriteContext` abstracts the dialect-specific operation
-creation, so the lowering logic doesn't need to know about CIR operations.
-
-#### Step 6: Return Value Handling
-
-For functions returning large structs (indirect return):
-
-```cpp
-// If return type is classified as Indirect:
-Value sretPtr = ctx.createAlloca(loc, retType, alignment);
-ctx.createCall(loc, func, {}, {sretPtr, ...otherArgs});
-Value result = ctx.createLoad(loc, sretPtr);
-```
-
-#### Complete Flow Diagram
-
-The diagram below combines the three-layer architecture (Section
-4.1) with the step-by-step flow, showing which layer owns each
-step.
-
-```
- ┌─────────────────────────────────────────────────────────┐
- │ Input: High-Level Function (CIR/FIR/other dialect) │
- │ func @foo(%arg0: i32, %arg1: struct<i64,i64>) -> i32 │
- └──────────────────────────┬──────────────────────────────┘
- │
- ╔═══════════════════════════╪═══════════════════════════════╗
- ║ MLIR Adapter Layer │ ║
- ║ ▼
... [truncated]
``````````

</details>

https://github.com/llvm/llvm-project/pull/192066
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
