Hello Nikos,

> > Can you tell me
> > one program which spends more than 20% of its runtime in memxor?
>
> I had 10% speed-ups in a web server that used gnutls with the optimized
> version of memxor. That is because CBC encryption mode uses XOR heavily.
> 10% is enormous speed-up considering that this is a very small part of
> the encryption process.

OK, that is sufficient rationale for having this faster memxor function
in gnutls.

The next question, also to Simon, is whether you want to have the faster
memxor only in gnutls and leave the slower but simpler one in gnulib
(used by the modules crypto/*hmac-* only), or whether you want to have
the faster one in gnulib. In the latter case, we need an implementation:

> I meant the linked memxor implementation[0]. It is plain C code, and
> XOR was being done per CPU word, not per byte.
>
> [0].
> http://cvs.lysator.liu.se/viewcvs/viewcvs.cgi/lsh/nettle/memxor.c?rev=1.4&root=lsh&view=auto

If you can convince Niels Möller to assign the copyright of this code to
the FSF, you can use this. Otherwise you can also take
gnulib/lib/memcmp.c as a starting point, eliminate the memory of Niels'
code from your brain, and do the necessary modifications yourself (and
we would like you to assign the copyright for this gnulib contribution,
likewise).

Additionally, such highly optimized code requires a unit test. Here you
are lucky and can take the union of gnulib/tests/test-memcmp.c and
gnulib/tests/test-memchr.c as a starting point.

If you also want to have a memxor3 in gnulib, make it a separate module.
It can share the lib/memxor.h file with the memxor module, though; there
is no need to have two separate .h files for such tightly related
functions.

Bruno
-- 
In memoriam Matthias Domaschk <http://de.wikipedia.org/wiki/Matthias_Domaschk>
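
For illustration only, the "per CPU word, not per byte" idea from the quoted text could look roughly like the sketch below. It is neither the nettle code nor a proposed gnulib implementation: the name memxor_wordwise_sketch is hypothetical, and it assumes that unsigned long is a reasonable stand-in for the CPU word and that the two buffers do not overlap.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration; not the nettle or gnulib code.
   XORs N bytes from SRC into DEST and returns DEST.
   Assumes DEST and SRC do not overlap.  */
void *
memxor_wordwise_sketch (void *dest, const void *src, size_t n)
{
  unsigned char *d = dest;
  const unsigned char *s = src;

  /* Process one word per iteration.  The loads and stores go through
     memcpy so that unaligned buffers are handled without undefined
     behaviour; compilers commonly turn these into plain word moves.  */
  while (n >= sizeof (unsigned long))
    {
      unsigned long a, b;
      memcpy (&a, d, sizeof a);
      memcpy (&b, s, sizeof b);
      a ^= b;
      memcpy (d, &a, sizeof a);
      d += sizeof a;
      s += sizeof a;
      n -= sizeof a;
    }

  /* XOR the remaining tail one byte at a time.  */
  while (n > 0)
    {
      *d++ ^= *s++;
      n--;
    }

  return dest;
}
```

A unit test along the lines suggested above would compare the result against a plain byte-by-byte loop for various lengths and buffer offsets.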