On Fri, Aug 22, 2025 at 03:11:16PM +0000, Qing Zhao wrote:
> > On Aug 21, 2025, at 17:29, Kees Cook <k...@kernel.org> wrote:
> > For non-static functions, we cannot know if other compilation units may
> > make indirect calls to a given function, so those functions must always
> > have their kcfi preamble added. For static functions, if they are
> > address-taken by the current compilation unit, then they must get a kcfi
> > preamble added.
>
> Oh, yeah, I see. Without LTO or whole-program mode, we cannot determine
> whether an extern function is address-taken or not. Therefore, we have to
> treat ALL extern functions conservatively as address-taken.
>
> So, from my understanding, the complete list of places that need to compute
> the typeid from the function prototype is:
>
> - At indirect call sites
>   - all indirect call sites (at the call site)
> - At function preambles
>   - all address-taken static functions (at the function definition)
>   - all extern functions (at function declaration or function
>     definition?? Please see my question below)

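As a rough illustration of the categories in the list above, here is a small C sketch. All function names are hypothetical, and the comments describe the treatment each function would be expected to get under the rules quoted here, not verified compiler output.

```c
/* Hypothetical names throughout; comments describe the expected KCFI
 * treatment under the rules quoted above, not verified compiler output. */

typedef int (*op_fn)(int);

/* Non-static definition: another compilation unit might call it
 * indirectly, so it would always get a kcfi preamble. */
int extern_op(int x)
{
	return x + 1;
}

/* Static and address-taken (see run_demo): would get a preamble. */
static int taken_op(int x)
{
	return x * 2;
}

/* Static and only ever called directly: no preamble required. */
static int direct_only_op(int x)
{
	return x - 1;
}

/* Indirect call site: the typeid derived from the int (*)(int)
 * prototype would be checked against the target's preamble here. */
int call_indirect(op_fn fn, int x)
{
	return fn(x);
}

int run_demo(int x)
{
	op_fn table[] = { extern_op, taken_op };  /* addresses taken here */
	int sum = direct_only_op(x);              /* direct call only */

	for (unsigned i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		sum += call_indirect(table[i], x);
	return sum;
}
```

Both entries in `table` share the `int (*)(int)` prototype, so they would hash to the same typeid, which is what lets the single check in `call_indirect` accept either target.
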
For "extern functions", the logic is split as:

- "all extern function definitions get preamble"
- "all extern function declarations without a definition that are
  address-taken get __kcfi_typeid_ symbol"

> > The other case is emitting the __kcfi_typeid_FUNC weak symbols, which is
> > used for link-time resolution with non-C code (i.e. raw .S assembly)
> > which doesn't have access to the C type system to calculate the hashes
> > on its own, and needs to have a way to build its own kcfi preambles.
>
> So, for such functions, there should be an extern function declaration in
> the C code. But the definition of such a function is not available in the
> C code we are compiling. Therefore the weak __kcfi_typeid_FUNC symbol is
> emitted at the function declaration point for such a function when we
> compile the C code?
>
> And the typeid (the hash value) for such a routine is computed at the
> function declaration point too.
>
> Is the above understanding correct?

Correct, the kcfi_typeid symbol and value are emitted at function
declaration point, but only if such function is address-taken.

> Then for the other extern functions whose definition is in the C code of
> other modules that might be compiled later, should the typeid be computed
> at the declaration or the definition?

It is computed and emitted just for externs that are address-taken.

> > This
> > is how Linux constructs its assembly function entry points:
> >
> > #ifndef __CFI_TYPE
> > #define __CFI_TYPE(name)				\
> > 	.4byte __kcfi_typeid_##name
> > #endif
> >
> > #define SYM_TYPED_ENTRY(name, linkage, align...)	\
> > 	linkage(name) ASM_NL				\
> > 	align ASM_NL					\
> > 	__CFI_TYPE(name) ASM_NL				\
> > 	name:
> >
> > That way all the asm functions can be indirect call targets without
> > knowing the hash value (which will be filled in at link time).
>
> Okay. I see. This is the case for the extern function whose definition is in
> the assembly file.
> (Not available in the C code)

Right, and sometimes we have to explicitly perform a no-op address-taking
to make sure a symbol gets generated:

/*
 * Force the compiler to emit 'sym' as a symbol, so that we can reference
 * it from inline assembler. Necessary in case 'sym' could be inlined
 * otherwise, or eliminated entirely due to lack of references that are
 * visible to the compiler.
 */
#define ___ADDRESSABLE(sym, __attrs)						\
	static void * __used __attrs						\
	__UNIQUE_ID(__PASTE(__addressable_,sym)) = (void *)(uintptr_t)&sym;

#define __ADDRESSABLE(sym) \
	___ADDRESSABLE(sym, __section(".discard.addressable"))

$ git grep KCFI_REFERENCE
include/linux/compiler.h:#define KCFI_REFERENCE(sym) __ADDRESSABLE(sym)
arch/x86/include/asm/page_64.h:KCFI_REFERENCE(copy_page);
arch/x86/include/asm/string_64.h:KCFI_REFERENCE(__memset);
arch/x86/include/asm/string_64.h:KCFI_REFERENCE(__memmove);
arch/x86/kernel/alternative.c:KCFI_REFERENCE(__bpf_prog_runX);
arch/x86/kernel/alternative.c:KCFI_REFERENCE(__bpf_callback_fn);

> > I assume I just didn't see how yet. :) I wasn't able to identify nor
> > store the typeid for function definitions that ultimately end up getting
> > .s file output.
> So, the problem only exists for the external functions whose definition is
> NOT in the C code?

Yup!

> > I couldn't figure out how to find these during the GIMPLE pass. Oh,
> > perhaps I can do this with an IPA pass? That should let me walk all
> > functions including externs. I'll give it a try...

Adding the IPA pass to find all functions worked perfectly. I was able
to remove all the weird DECL reconstruction and just use the original
FUNCTION_TYPE info for the typeids.

-Kees

-- 
Kees Cook
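
For anyone who wants to try the no-op address-taking trick from this message outside the kernel, the following is a minimal userspace sketch. It assumes GCC or Clang. `ADDRESSABLE`, `PASTE`, and `asm_helper` are hypothetical stand-ins: the kernel's `__used`, `__UNIQUE_ID`, and `__section(".discard.addressable")` are replaced by a plain `used` attribute and a fixed name, and the helper is defined in C here only so the example builds on its own (in the real case it would live in a .S file).

```c
#include <stdint.h>

/* Two-level paste so the macro argument is expanded before ## applies. */
#define PASTE_(a, b) a##b
#define PASTE(a, b) PASTE_(a, b)

/*
 * Simplified stand-in for ___ADDRESSABLE: store the symbol's address in
 * a static pointer marked 'used', so the compiler sees an address-taken
 * reference and keeps both the pointer and the symbol.
 */
#define ADDRESSABLE(sym) \
	__attribute__((used)) static void *PASTE(addressable_, sym) = \
		(void *)(uintptr_t)&sym

/* Stand-in for a function that would really be defined in assembly. */
long asm_helper(long x)
{
	return x + 1;
}

/* The no-op address-taking itself: nothing ever reads this pointer. */
ADDRESSABLE(asm_helper);
```

The pointer is never read at run time. Its only job is to make the reference visible to the compiler, which in the KCFI case is what triggers emission of the matching `__kcfi_typeid_` symbol for an address-taken extern declaration.
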