On Fri, 26 Sept 2025 at 05:02, Kees Cook <[email protected]> wrote: > > diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md > index 422ae549b65b..c3b9f16ea872 100644 > --- a/gcc/config/arm/arm.md > +++ b/gcc/config/arm/arm.md ... > +/* Output the assembly for a KCFI checked call instruction. INSN is the > + RTL instruction being processed. OPERANDS is the array of RTL operands > + where operands[0] is the call target register, operands[2] is the KCFI > + type ID constant. Returns an empty string as all output is handled by > + direct assembly generation. */ > + > +const char * > +arm_output_kcfi_insn (rtx_insn *insn, rtx *operands) > +{ > + /* KCFI type id. */ > + uint32_t type_id = INTVAL (operands[2]); > + > + /* Calculate typeid offset from call target. */ > + HOST_WIDE_INT offset = -kcfi_typeid_offset; > + > + /* Generate custom label names. */ > + char trap_name[32]; > + char call_name[32]; > + ASM_GENERATE_INTERNAL_LABEL (trap_name, "Lkcfi_trap", kcfi_labelno); > + ASM_GENERATE_INTERNAL_LABEL (call_name, "Lkcfi_call", kcfi_labelno); > + > + /* Create memory operand for the type load. */ > + rtx mem_op = gen_rtx_MEM (SImode, > + gen_rtx_PLUS (SImode, operands[0], > + GEN_INT (offset))); > + rtx temp_operands[6]; > + > + /* Normally we can use r12 as our scratch register. */ > + unsigned scratch_reg_num = IP_REGNUM; > + /* If register pressure has made r12 our target register, we need to pick > + a different register. We don't want to spill our target register > + because on reload at the end of the KCFI check, we'd be producing > + the very kind of call gadget we were trying to protect against: > + "pop %target; call %target". In this case, use r3 as our scratch > + register. But since r3 may be used for function arguments, we need > + to check if it is being used for that and only spill/reload if that > + happens. Any spill/reload of r3 due to making a call will already > + have been managed by the register allocator, so we only have to care > + about not clobbering the argument value it may be carrying into the > + call here. Also use r3 when r12 is a fixed register. */ > + if (REGNO (operands[0]) == scratch_reg_num > + || fixed_regs[scratch_reg_num]) > + scratch_reg_num = LAST_ARG_REGNUM; > + rtx scratch_reg = gen_rtx_REG (SImode, scratch_reg_num); > + > + /* We only need to spill r3 if it's actually used by the call. */ > + bool need_spill = (scratch_reg_num == LAST_ARG_REGNUM) > + && reg_overlap_mentioned_p (scratch_reg, insn); > + > + /* Calculate trap immediate. */ > + unsigned addr_reg_num = REGNO (operands[0]); > + /* The scratch register is always clobbered by eor seq: use 0x1F. */ > + unsigned udf_immediate = 0x8000 | (0x1F << 5) | (addr_reg_num & 31); > +
I take it this means you still need to decode the instructions in the kernel to obtain the expected type id? Can't you insert the actual register index here, and defer the reload until after the UDF? That way, the scratch register will always contain the XOR of the actual vs expected typeids when taking the trap.
