https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113855
--- Comment #4 from Iain Sandoe <iains at gcc dot gnu.org> ---
Created attachment 57378
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57378&action=edit
patch under test

This implements the common case for an i386 trampoline (and, in this respect, matches the expectations for x86_64 and aarch64).

----------

To explain the "common case" comment:

We still have work to do to generalise the heap trampoline implementation; several platforms have multiple trampoline implementations that can depend on optimisations or protections.

At present, I have considered:

(1) - having the heap trampoline impl just deal with managing the storage.
  -- one would call a builtin to obtain a writable trampoline area
  -- one would then use the existing inline trampoline impl to populate that
  -- one would then call into the heap management to apply conditions to make the written trampoline executable (since most security models do not allow writable+executable at the same time).
  -- although this model re-uses the existing inline trampoline code-gen, I'm not in favour of it because (a) it means two calls to libgcc_s for each case and (b) the inline code-gen is replicated, which appears to be pointless code bloat since we have already failed to avoid an out-of-line function call.

(2) - having the caller of __gcc_heap_trampoline_created () pass sufficient information for that function to alter the flavour of trampoline as needed.
  (i) - we could add an extra argument to communicate this information (it could be as simple as an enum, probably).
  (ii) - we could re-use the third argument (which is a pointer to a pointer) to pass in a pointer to the data (or nullptr, for 'default' perhaps)
  -- in either case we need a new target hook that returns the relevant enum (or other target-private data) to be passed to __gcc_heap_trampoline_created ().
(3) - perhaps using the existing inline code-gen to write into a scratch space and passing that to __gcc_heap_trampoline_created ()
  -- not keen on this because:
     (a) the same comment about inline code + out-of-line call applies.
     (b) if the trampoline uses relative branches (like the i386 common case) then those need to be relocated, which means that the __gcc_heap_trampoline_created routine would have to match the relevant patterns and do the relocation.

So, at the moment, (2) is my favoured approach, but more thinking might bring other ideas.
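
For illustration only, a minimal sketch of what (2)(i) might look like for the i386 common case. Everything named here is an assumption made for the example, not part of the patch or of libgcc: the enum tramp_flavour, its two values, and the helper fill_i386_trampoline are hypothetical. It assumes the static chain is passed in %ecx, takes endbr32 as one possible "protection" flavour, and writes into a plain byte buffer rather than executable memory so that the encoding can be checked on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: flavours a new target hook might report to the
   out-of-line routine.  Not an existing GCC enum.  */
enum tramp_flavour
{
  TRAMP_DEFAULT,  /* movl $chain, %ecx ; jmp func  */
  TRAMP_ENDBR     /* the same, preceded by endbr32  */
};

/* Store V at P in little-endian byte order.  */
static void
put_le32 (uint8_t *p, uint32_t v)
{
  p[0] = (uint8_t) v;
  p[1] = (uint8_t) (v >> 8);
  p[2] = (uint8_t) (v >> 16);
  p[3] = (uint8_t) (v >> 24);
}

/* Hypothetical stand-in for the out-of-line routine: encode into BUF an
   i386 trampoline that will execute at address TRAMP_ADDR, loading CHAIN
   and jumping to FUNC.  Returns the number of bytes written (at most 14).  */
static size_t
fill_i386_trampoline (uint8_t *buf, uint32_t tramp_addr, uint32_t chain,
                      uint32_t func, enum tramp_flavour flavour)
{
  size_t n = 0;

  assert (buf != NULL);

  if (flavour == TRAMP_ENDBR)
    {
      /* endbr32  */
      buf[n++] = 0xf3;
      buf[n++] = 0x0f;
      buf[n++] = 0x1e;
      buf[n++] = 0xfb;
    }

  /* movl $chain, %ecx  */
  buf[n++] = 0xb9;
  put_le32 (buf + n, chain);
  n += 4;

  /* jmp rel32: the displacement is relative to the address of the byte
     following the jmp, so it depends on where the trampoline lives.  */
  buf[n++] = 0xe9;
  put_le32 (buf + n, func - (tramp_addr + (uint32_t) n + 4));
  n += 4;

  return n;
}
```

Because the routine receives the flavour and already knows the final address, it can compute the relative branch directly; nothing has to be pattern-matched and relocated afterwards, which is the concern raised in (3)(b). For example, with the trampoline at 0x1000 and the target at 0x2000, the default flavour is 10 bytes long and the displacement is 0x2000 - 0x100a = 0xff6.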