> On Sat, Jan 13, 2018 at 8:51 AM, Woodhouse, David <d...@amazon.co.uk> wrote:
> > On Sat, 2018-01-13 at 08:09 -0800, H.J. Lu wrote:
> >>
> >> > Again please extend both documentation hunks so it is clear what is
> >> > purpose of this hack.
> >>
> >> David, can you help here?
> >
> > On most older CPUs the indirect branch issue is limited to actual
> > indirect branches.
> >
> > On Skylake-era CPUs, however, an underflow of the RSB (return stack
> > buffer) caused by a call/ret imbalance (such as on context switch) will
> > cause predictions to come from the same problematic branch predictor,
> > essentially allowing 'ret' instructions to be targeted by an attacker
> > in precisely the same way as indirect branches.
> >
> > Note that there are plenty of other causes for RSB underflow. Like
> > taking an SMI, which clears the RSB completely. Or various other
> > things. Including a call stack deeper than 16 function calls.
> >
> > The -mfunction-return option was an experiment to use the retpoline
> > approach for 'ret' too. I forget the implementation (I could look
> > upthread), but essentially it was equivalent to replacing ret with
> > 'pop %r12; jmp __x86_indirect_thunk_r12' so that you *never* deplete
> > the RSB because of the 'call;ret' trick in the retpoline itself. Hence
> > your exposure on Skylake was reduced to the possibility of taking an
> > SMI while *in* the retpoline.
>
> RCX/ECX is a scratch register for both 32-bit and 64-bit.  Is it OK
> to use it for "ret":
>
>         pop %rcx
>         jmp __x86_indirect_thunk_rcx

Is it also safe for local functions and IPA-ra?
Also, what will your patchset do with the large code model? Perhaps we want to
emit a sorry () there saying that it is not supported.

Honza
>
> > This would, of course, be forcing a mispredict/pipeline stall on every
> > 'ret', rather than only on every indirect branch as in the original
> > retpoline idea. HJ added the code, but I'm not sure anyone at Intel
> > ever did actually do the *testing* to establish the performance
> > characteristics. Dave/Arjan?
> >
> > For my part, right *now* the kernel doesn't use this option. But then,
> > we don't have a comprehensive answer for Skylake yet other than "use
> > the new microcode features". Which are slower than retpoline, but not
> > as *much* slower on Skylake as they are on other CPUs.
> >
> > Amazon Web Services UK Limited. Registered in England and Wales with
> > registration number 08650665 and which has its registered office at 60
> > Holborn Viaduct, London EC1A 2FD, United Kingdom.
>
> --
> H.J.
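
For illustration only, here is a toy model of the RSB underflow case David describes in the quoted text (a call stack deeper than 16 calls). It is a sketch under stated assumptions, not a description of real hardware: the names ToyRSB and simulate are hypothetical, the depth of 16 is taken from the quoted text, and dropping the oldest entry when the buffer is full is an assumption of the model.

```python
from collections import deque

RSB_DEPTH = 16  # assumed capacity, taken from the "16 function calls" remark


class ToyRSB:
    """Toy return stack buffer: a call pushes, a ret pops."""

    def __init__(self, depth=RSB_DEPTH):
        # A deque with maxlen silently drops the oldest entry when full
        # (an assumption of this model, not a hardware claim).
        self.entries = deque(maxlen=depth)
        self.underflows = 0

    def call(self, return_address):
        self.entries.append(return_address)

    def ret(self):
        """Return the predicted return address, or None on underflow."""
        if self.entries:
            return self.entries.pop()
        # Empty buffer: in this model the prediction has to come from
        # somewhere else, which is the case the thread is worried about.
        self.underflows += 1
        return None


def simulate(call_depth, depth=RSB_DEPTH):
    """Nest call_depth calls, unwind them all, count underflowing rets."""
    rsb = ToyRSB(depth)
    for addr in range(call_depth):
        rsb.call(addr)
    for _ in range(call_depth):
        rsb.ret()
    return rsb.underflows
```

Under these assumptions, simulate(16) reports no underflow, while simulate(20) reports 4 underflowing returns: the four outermost frames were pushed out of the buffer before the unwind reached them.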