On Jun 22, 7:10 pm, Mads Kiilerich <m...@kiilerich.com> wrote:
> Wouldn't the bulk word processing (sic!) that really matters
> perform even better if it didn't have to consider masks because leading
> and trailing bytes had been handled byte by byte?

That is already done. There is separate code for the leading, bulk, and trailing portions of the buffer, and the code for the bulk portion works without masks if the source and destination are aligned with each other.

At issue is the code for the leading and trailing portions, which currently accesses a word of memory and could be changed to access individual bytes. Since this code only handles the ends of the buffer, I think the performance loss would be minimal unless the program calls RC4_Encrypt many times on small unaligned buffers. But the only way to really know who is right is to test it.

In bug 451754 comment #4, Wan-Teh Chung wondered what memcpy does. The glibc implementation for i686 handles the ends of the buffer byte by byte:

http://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/i386/i686/memcpy.S

So I think that is probably the right thing to do, though I'm not offering to implement it.

> By the way: I can imagine that the current approach can cause real
> problems if the memory next to the buffers concurrently is modified from
> other threads. Perhaps that should be mentioned too.

Yes, that is a problem. I imagine it's rare that a program would use memory that way, but if it did, the user would be rightfully upset about NSS stomping on their memory.

--
Matt
--
dev-tech-crypto mailing list
dev-tech-crypto@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tech-crypto
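
To illustrate the byte-by-byte handling of the buffer ends discussed in the message above, here is a minimal sketch in C. It is not the NSS code: xor_buffer and its ks (key stream) argument are hypothetical names, and a plain XOR against a precomputed key stream stands in for what RC4_Encrypt does. The point is only that the leading and trailing loops touch single bytes, so nothing outside the len bytes of dst is ever read or written.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not the NSS API: XOR `len` bytes of `src` with the
 * key stream `ks` and store the result in `dst`. */
static void xor_buffer(unsigned char *dst, const unsigned char *src,
                       const unsigned char *ks, size_t len)
{
    size_t i = 0;

    /* Leading portion: single bytes until dst reaches a word boundary. */
    while (i < len && ((uintptr_t)(dst + i) % sizeof(size_t)) != 0) {
        dst[i] = src[i] ^ ks[i];
        i++;
    }

    /* Bulk portion: whole words, no masks. memcpy keeps the loads and
     * stores well defined even if src or ks are not word aligned. */
    while (len - i >= sizeof(size_t)) {
        size_t s, k;
        memcpy(&s, src + i, sizeof s);
        memcpy(&k, ks + i, sizeof k);
        s ^= k;
        memcpy(dst + i, &s, sizeof s);
        i += sizeof(size_t);
    }

    /* Trailing portion: the remaining bytes, one at a time. */
    while (i < len) {
        dst[i] = src[i] ^ ks[i];
        i++;
    }
}
```

Calling xor_buffer(buf + 3, buf + 3, ks, 19) on a 32-byte array, for example, changes only bytes 3 through 21. Whether this costs measurable performance compared with the masked word accesses is exactly the open question in the thread and would need to be benchmarked.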