sivadeilra wrote:

> Oh so all this dance (`_ref_` and the additional metadata) is for code page
> integrity purposes only? To keep them unmodified in memory? So how does then
> the kernel use the PE metadata if it doesn't patch the code memory pages of
> the initial (running) image? Is there an additional mechanism for trapping
> and redirecting calls into the new image? If that's the case, there's no
> really image patching involved at runtime? The "hot-patching" part is just a
> _vue d'esprit_ to present the concept to the user?

Sorry, I may have unintentionally misled you. Let me clarify. Our workflow for generating patches is this:

1. A vulnerability is identified and the affected functions are identified.
2. Hot-patching requirements are checked (a combination of manual and automated checks). The patch cannot introduce new DLL imports/exports, cannot change function signatures of existing functions (unless they are entirely inlined), cannot add new fields to existing types, etc.
3. An "intermediate patch image" is created. This is a normal compilation of a complete binary, but with flags added to the compiler that tell it to hot-patch certain functions. This step is the focus of this PR. The compiler and linker generate a complete executable image, but the hot-patched functions use `__ref_*` indirection and the PDB contains a description of the hot-patched functions.
4. We run our hot-patch generation tool, which compares the original image (called the "base") with the "intermediate" image. It automatically verifies many of our requirements; if those requirements fail, the developer must re-evaluate and go back to step 2. The most common cause is that inlining caused a function to be pulled into the hot-patch set. The output of this step is the "hot-patch metadata".
5. The tools modify a copy of the intermediate image and insert the "hot-patch metadata". This creates the "final image".

The final image can either be loaded as a complete, standalone binary (and will be, when the system reboots) or can be used to hot-patch an active instance of the base image.

The metadata describes code patches in _both_ directions, as well as data patches in _one_ direction. The code patches modify the base image in memory, using a code update idiom that does not require stopping threads or CPUs. The code patches also modify the hot-patch image, so that function calls to non-patched code are modified to point back into the base image.

Calls are redirected back into the base image so that multiple hot-patches of the same binary can occur; each new hot-patch completely replaces the code from the previous hot-patch, although existing threads (or CPUs) may continue executing the previous hot-patch's code until they return from it. The hot-patch metadata also describes how to set the initial value of the `__ref_*` variables to point into the variables in the base image.

So I didn't mean to imply that the sole purpose of the `__ref_*` pointers was to avoid modifying the original image in memory.

The Windows kernel contains the code that interprets the hot-patch metadata, so the format and semantics of the hot-patch metadata are determined by that code. This PR is meant to enable Clang (and eventually Rust) to generate code that can work in this workflow.

For reference, @dpaoliello and I are the authors of Rust code that executes in the Windows kernel. Our motivation for this work is to enable hot-patching of this code and of related non-Rust code in the same images. Aligning LLVM's codegen in this situation and providing the `S_HOTPATCHFUNC` symbol are a necessary part of enabling this whole scenario and of continuing with Rust development within the Windows kernel.

https://github.com/llvm/llvm-project/pull/138972
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits