Mike Belopuhov wrote this message on Wed, Nov 12, 2014 at 19:05 +0100:
> On 10 October 2014 02:39, Damien Miller <d...@mindrot.org> wrote:
> > On Thu, 9 Oct 2014, Christian Weisgerber wrote:
> >
> >> John-Mark Gurney:
> >>
> >> > I also have an implementation of ghash that does a 4 bit lookup table
> >> > version with the table split between cache lines in p4 at:
> >> > https://p4db.freebsd.org/fileViewer.cgi?FSPC=//depot/projects/opencrypto/sys/opencrypto/gfmult.c&REV=4
> >> >
> >> > This also has a version with does 4 blocks at a time getting a
> >> > further speed up...
> >>
> >> FWIW, I did a quick & dirty merge of this into the OpenBSD tree and
> >> the speed of my test (net6501-50, tcpbench -u over esp aes-128-gmac)
> >> almost doubled.
> >
> > isn't this likely to make it more likely to be subject to timing
> > attacks?
>
> then how is this different to our table based aes implementation?

My gfmul code spreads the table out over four 64-byte arrays (64 bytes
being the common cache line size), and to read an entry, it must access
all four...  This doesn't entirely mitigate cache timing attacks, because
accesses to the first 64 bits of a cache line are faster:
https://bugzilla.mozilla.org/show_bug.cgi?id=868948#c5

> and it's the same C code as in openssl which also uses table based
> gcm implementation.
>
> what countermeasures can be applied to the table lookup code
> to fight these attacks?

If you mean for AES, there is a version of AES that uses SSE instructions
to bitslice the AES S-box and calculate it as if it were a set of logic
gates...  See: Faster and Timing-Attack Resistant AES-GCM by Emilia Kasper
and Peter Schwabe...  But obviously that requires FPU context saving...

One of the issues today is that most of the research on implementations
of algorithms assumes you have access to SSE operations; there isn't much
research going into making safe implementations that don't use SSE...

-- 
John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
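
For readers following the thread, here is a minimal sketch of the split-table idea described above: a 16-entry, 128-bit-per-entry table stored as four 64-byte arrays, so that reading any entry touches all four cache lines. This is not the code from gfmult.c; the names (split_table, entry128, split_table_load, split_table_store), the 32-bit word split, and the 64-byte line size are assumptions made for illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define CACHE_LINE 64	/* assumed cache line size in bytes */

/* One 128-bit table entry, held as four 32-bit words. */
struct entry128 {
	uint32_t w[4];
};

/*
 * Hypothetical split layout: array k holds word k of all 16 entries.
 * Each array is 16 * 4 = 64 bytes and is aligned to its own cache line.
 */
struct split_table {
	alignas(CACHE_LINE) uint32_t w0[16];
	alignas(CACHE_LINE) uint32_t w1[16];
	alignas(CACHE_LINE) uint32_t w2[16];
	alignas(CACHE_LINE) uint32_t w3[16];
};

static_assert(sizeof(struct split_table) == 4 * CACHE_LINE,
    "table should span exactly four cache lines");

/* Store entry e at the 4-bit index idx. */
static inline void
split_table_store(struct split_table *t, unsigned idx,
    const struct entry128 *e)
{
	idx &= 0xf;
	t->w0[idx] = e->w[0];
	t->w1[idx] = e->w[1];
	t->w2[idx] = e->w[2];
	t->w3[idx] = e->w[3];
}

/*
 * Load the entry at idx.  Every load reads one word from each of the
 * four arrays, so the set of cache lines touched does not depend on idx.
 * The offset within each line still depends on idx.
 */
static inline struct entry128
split_table_load(const struct split_table *t, unsigned idx)
{
	struct entry128 e;

	idx &= 0xf;
	e.w[0] = t->w0[idx];
	e.w[1] = t->w1[idx];
	e.w[2] = t->w2[idx];
	e.w[3] = t->w3[idx];
	return e;
}
```

With a conventional layout (16 consecutive 16-byte entries), the index would select one of four cache lines; here it only selects an offset within each line, which is why the within-line timing difference mentioned above still matters.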