On Fri, Aug 22, 2025 at 03:11:16PM +0000, Qing Zhao wrote:
> > On Aug 21, 2025, at 17:29, Kees Cook <k...@kernel.org> wrote:
> > For non-static functions, we cannot know if other compilation units may
> > make indirect calls to a given function, so those functions must always
> > have their kcfi preamble added. For static functions, if they are
> > address-taken by the current compilation unit, then they must get a kcfi
> > preamble added.
> 
> Oh, yeah, I see. Without LTO or whole-program mode, we cannot determine
> whether an extern function is address-taken or not. Therefore, we have to
> treat ALL extern functions conservatively as address-taken.
> 
> So, from my understanding, the complete list of places that need to compute
> the typeid from the function prototype is:
> 
> - At indirect call sites
>       - all indirect call sites; (At the call site)
> - At function preambles
>       - all address-taken static functions  (At the function definition)
>       - all extern functions  (At function declaration or function 
> definition?? Please see my question below)

For "extern functions", the logic is split as:
 - "all extern function definitions get preamble"
 - "all extern function declarations without a definition that are
   address-taken get __kcfi_typeid_ symbol"
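
As a rough illustration of that split (a standalone sketch, not code
from the patch), the rules can be modeled in plain C. The struct and
all function names below are hypothetical and are not GCC internals:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of what the compiler knows about one function
 * in the current compilation unit. Not a GCC data structure. */
struct demo_fn_info {
	bool is_static;      /* internal linkage */
	bool has_definition; /* body present in this compilation unit */
	bool address_taken;  /* address taken in this compilation unit */
};

/* Preamble: every extern definition, plus address-taken statics. */
bool demo_needs_kcfi_preamble(const struct demo_fn_info *f)
{
	if (!f->has_definition)
		return false;
	return !f->is_static || f->address_taken;
}

/* __kcfi_typeid_ symbol: address-taken extern declarations that
 * have no definition in this compilation unit. */
bool demo_needs_kcfi_typeid_symbol(const struct demo_fn_info *f)
{
	return !f->is_static && !f->has_definition && f->address_taken;
}

/* Walk through the cases discussed in the thread. */
void demo_kcfi_rules_selfcheck(void)
{
	const struct demo_fn_info extern_def = { false, true, false };
	const struct demo_fn_info static_taken = { true, true, true };
	const struct demo_fn_info static_untaken = { true, true, false };
	const struct demo_fn_info decl_taken = { false, false, true };
	const struct demo_fn_info decl_untaken = { false, false, false };

	assert(demo_needs_kcfi_preamble(&extern_def));
	assert(demo_needs_kcfi_preamble(&static_taken));
	assert(!demo_needs_kcfi_preamble(&static_untaken));
	assert(demo_needs_kcfi_typeid_symbol(&decl_taken));
	assert(!demo_needs_kcfi_typeid_symbol(&decl_untaken));
}
```

Note that the two predicates never both hold for the same function: a
preamble requires a definition, and the typeid symbol requires the
absence of one.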

> > The other case is emitting the __kcfi_typeid_FUNC weak symbols, which is
> > used for link-time resolution with non-C code (i.e. raw .S assembly)
> > which doesn't have access to the C type system to calculate the hashes
> > on its own, and needs to have a way to build its own kcfi preambles.
> 
> So, for such functions, there should be an extern function declaration in
> the C code, but the definition of such a function is not available in the
> C code we are compiling. Therefore, the weak __kcfi_typeid_FUNC symbol is
> emitted at the function declaration point for such a function when we
> compile the C code?
> 
> And the typeid (the hash value) for such a routine is computed at the
> function declaration point too.
> 
> Is the above understanding correct? 

Correct, the kcfi_typeid symbol and value are emitted at function
declaration point, but only if such function is address-taken.

> Then for the other extern functions whose definitions are in the C code of
> other modules that might be compiled later, should the typeid be computed
> at the declaration or the definition?

It is computed and emitted just for externs that are address-taken.

> > This
> > is how Linux constructs its assembly function entry points:
> > 
> > #ifndef __CFI_TYPE
> > #define __CFI_TYPE(name)                                \
> >        .4byte __kcfi_typeid_##name
> > #endif
> > 
> > #define SYM_TYPED_ENTRY(name, linkage, align...)        \
> >        linkage(name) ASM_NL                            \
> >        align ASM_NL                                    \
> >        __CFI_TYPE(name) ASM_NL                         \
> >        name:
> > 
> > That way all the asm functions can be indirect call targets without
> > knowing the hash value (which will be filled in at link time).
> 
> Okay. I see. This is the case for the extern function whose definition is
> in the assembly file (not available in the C code).

Right, and sometimes we have to explicitly perform a no-op
address-taking to make sure a symbol gets generated:

/*
 * Force the compiler to emit 'sym' as a symbol, so that we can reference
 * it from inline assembler. Necessary in case 'sym' could be inlined
 * otherwise, or eliminated entirely due to lack of references that are
 * visible to the compiler.
 */
#define ___ADDRESSABLE(sym, __attrs)                                    \
        static void * __used __attrs                                    \
        __UNIQUE_ID(__PASTE(__addressable_,sym)) = (void *)(uintptr_t)&sym;

#define __ADDRESSABLE(sym) \
        ___ADDRESSABLE(sym, __section(".discard.addressable"))

$ git grep KCFI_REFERENCE
include/linux/compiler.h:#define KCFI_REFERENCE(sym) __ADDRESSABLE(sym)
arch/x86/include/asm/page_64.h:KCFI_REFERENCE(copy_page);
arch/x86/include/asm/string_64.h:KCFI_REFERENCE(__memset);
arch/x86/include/asm/string_64.h:KCFI_REFERENCE(__memmove);
arch/x86/kernel/alternative.c:KCFI_REFERENCE(__bpf_prog_runX);
arch/x86/kernel/alternative.c:KCFI_REFERENCE(__bpf_callback_fn);
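
To show the same no-op address-taking trick outside the kernel, here
is a minimal userspace sketch. It assumes GCC or Clang for the "used"
attribute; DEMO_ADDRESSABLE and demo_copy_page are hypothetical names,
and the kernel's __UNIQUE_ID and section placement are left out:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel macro: keep a 'used' pointer
 * to sym so the compiler treats sym as address-taken even though
 * nothing in C ever calls through that pointer. */
#define DEMO_ADDRESSABLE(sym) \
	static void *__attribute__((used)) demo_addressable_##sym = \
		(void *)(uintptr_t)&sym

/* Stand-in for a function whose real definition would be in a .S file. */
int demo_copy_page(int x);

DEMO_ADDRESSABLE(demo_copy_page);

/* C definition provided only so that this sketch links on its own. */
int demo_copy_page(int x)
{
	return x + 1;
}

/* Check that the retained pointer really refers to the function. */
void demo_check_reference(void)
{
	assert(demo_addressable_demo_copy_page ==
	       (void *)(uintptr_t)&demo_copy_page);
}
```

The pointer is never read at run time in real use; its only job is to
make the address-taking visible to the compiler for that declaration.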


> > I assume I just didn't see how yet. :) I wasn't able to identify nor
> > store the typeid for function definitions that ultimately end up getting
> > .s file output.
> So, the problem only exists for the external functions whose definition is 
> NOT in the C code? 

Yup!

> > I couldn't figure out how to find these during the GIMPLE pass. Oh,
> > perhaps I can do this with an IPA pass? That should let me walk all
> > functions including externs. I'll give it a try...

Adding the IPA pass to find all functions worked perfectly. I was able
to remove all the weird DECL reconstruction and just use the original
FUNCTION_TYPE info for the typeids.

-Kees

-- 
Kees Cook
