https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81201
Bug ID: 81201
Summary: The final asm code doesn't check if a function changes the value of ebx, resulting in segmentation fault.
Product: gcc
Version: 6.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: arget at autistici dot org
Target Milestone: ---

Created attachment 41628
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41628&action=edit
GCC preprocessed code of the source code described above on raspbian.

Hi, compiling with gcc version 6.3.0:

arget@plata:~$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 6.3.0-18' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-6 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 6.3.0 20170516 (Debian 6.3.0-18)
arget@plata:~$ uname -a
Linux plata 4.9.0-3-amd64 #1 SMP Debian 4.9.30-2+deb9u1 (2017-06-18) x86_64 GNU/Linux

There is a problem with the following code. It's just my own implementation of the ChaCha20 stream cipher used as a CSPRNG (Cryptographically Secure Pseudo-Random Number Generator, something like random() but safer):

#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define NONCE "\x00\x00\x00\x09\x00\x00\x00\x4a\x00\x00\x00\x00"

#define RtoL(x, n) \
    ((x << n) | (x >> (32 - n)))

#define QR(a, b, c, d) \
    estado[a] += estado[b]; estado[d] ^= estado[a]; estado[d] = RtoL(estado[d], 16); \
    estado[c] += estado[d]; estado[b] ^= estado[c]; estado[b] = RtoL(estado[b], 12); \
    estado[a] += estado[b]; estado[d] ^= estado[a]; estado[d] = RtoL(estado[d], 8); \
    estado[c] += estado[d]; estado[b] ^= estado[c]; estado[b] = RtoL(estado[b], 7);

// "expand 32-byte k"
static const uint32_t chachaConst[4] = {0x61707865, 0x3320646e, 0x79622d32, 0x6b206574};
static uint32_t chachaKey[8], chachaCount = 0, chachaNonce[3];
static uint8_t quedanPorLeer = 0, chachaRandomOutput[64];

static void chacha()
{
    uint32_t estado[16];
    uint32_t i;

    memcpy(estado, chachaConst, 16);
    memcpy(&estado[4], chachaKey, 64);
    chachaCount++;
    estado[12] = chachaCount;
    memcpy(&estado[13], chachaNonce, 12);

    memcpy(chachaRandomOutput, estado, 64);

    for(i = 0; i < 10; i++)
    {
        QR(0, 4, 8, 12)
        QR(1, 5, 9, 13)
        QR(2, 6, 10, 14)
        QR(3, 7, 11, 15)
        QR(0, 5, 10, 15)
        QR(1, 6, 11, 12)
        QR(2, 7, 8, 13)
        QR(3, 4, 9, 14)
    }

    uint32_t *q = (uint32_t*)chachaRandomOutput;
    for(i = 0; i < 64; i++)
        q[i] += estado[i];
}

void chachaSeed(const uint8_t s[32])
{
    memcpy(chachaKey, s, 32);
    memcpy(chachaNonce, NONCE, 12);
    chachaCount = 0;
    quedanPorLeer = 0;
}

uint8_t chachaGet()
{
    if(!quedanPorLeer)
    {
        chacha();
        quedanPorLeer = 64;
    }
    return chachaRandomOutput[64 - (quedanPorLeer--)];
}

int main()
{
    chachaSeed((const uint8_t*)"\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f");
    int i;
    for(i = 0; i < 64; i++)
        printf("%c", chachaGet());
}

The ChaCha20 CSPRNG works similarly to random() and srandom(): chachaSeed() sets a 32-byte seed, and successive calls to chachaGet() return the random bytes one at a time. The nonce and seed values are the inputs used in the test vector of RFC 7539 (https://tools.ietf.org/html/rfc7539#section-2.3.2).

The program works fine when compiled for 64 bits:

arget@plata:~$ gcc a.c -o a
arget@plata:~$ file a
a: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=a11bd03a3633d945c5ef7f20baa85b313f65dac6, not stripped
arget@plata:~$ ./a | xxd
00000000: 10f1 e7e4 d13b 5915 500f dd1f a320 71c4  .....;Y.P.... q.
00000010: c7d1 f4c7 33c0 6803 0422 aa9a c3d4 6c4e  ....3.h.."....lN
00000020: d282 6446 079f aa09 14c2 d705 d98b 02a2  ..dF............
00000030: b512 9cd1 de16 4eb9 cbd0 83e8 a250 3c4e  ......N......P<N

But compiled for 32 bits it crashes with SIGSEGV:

arget@plata:~$ gcc a.c -o a -m32
arget@plata:~$ ./a
Violación de segmento

[Sorry, my system is in Spanish; the message means "Segmentation fault".]

I have tried -Wall -Wextra and -fno-strict-aliasing -fwrapv, but gcc doesn't find anything wrong in my code, and the problem persists.
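For comparison with the listing above, the following is a minimal sketch of a single ChaCha20 block under the RFC 7539 state layout: 4 constant words, 8 key words (32 bytes), 1 counter word and 3 nonce words (12 bytes), 16 words in total. It is not the code from this report; the names chacha20_block, quarter_round, rotl32, load32_le and store32_le are hypothetical, and every copy and loop bound is derived from that 16-word layout. Under this layout the key copy covers 32 bytes and the final addition covers 16 words, whereas the listing above passes 64 in both places, which may be worth ruling out before looking further at the generated code.

```c
#include <stddef.h>
#include <stdint.h>

#define STATE_WORDS 16 /* 4 constants + 8 key + 1 counter + 3 nonce */

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> (32 - n));
}

/* Load/store 32-bit little-endian words independently of host byte order. */
static uint32_t load32_le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void store32_le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* One ChaCha quarter round on four words of the state. */
static void quarter_round(uint32_t s[STATE_WORDS], int a, int b, int c, int d)
{
    s[a] += s[b]; s[d] ^= s[a]; s[d] = rotl32(s[d], 16);
    s[c] += s[d]; s[b] ^= s[c]; s[b] = rotl32(s[b], 12);
    s[a] += s[b]; s[d] ^= s[a]; s[d] = rotl32(s[d], 8);
    s[c] += s[d]; s[b] ^= s[c]; s[b] = rotl32(s[b], 7);
}

/* Hypothetical helper: compute one 64-byte keystream block. */
void chacha20_block(const uint8_t key[32], uint32_t counter,
                    const uint8_t nonce[12], uint8_t out[64])
{
    /* "expand 32-byte k" */
    static const uint32_t sigma[4] = {0x61707865, 0x3320646e,
                                      0x79622d32, 0x6b206574};
    uint32_t init[STATE_WORDS], work[STATE_WORDS];
    size_t i;

    for (i = 0; i < 4; i++)
        init[i] = sigma[i];
    for (i = 0; i < 8; i++)              /* 8 key words = 32 bytes */
        init[4 + i] = load32_le(key + 4 * i);
    init[12] = counter;
    for (i = 0; i < 3; i++)              /* 3 nonce words = 12 bytes */
        init[13 + i] = load32_le(nonce + 4 * i);

    for (i = 0; i < STATE_WORDS; i++)
        work[i] = init[i];

    for (i = 0; i < 10; i++) {           /* 10 double rounds = 20 rounds */
        quarter_round(work, 0, 4, 8, 12);
        quarter_round(work, 1, 5, 9, 13);
        quarter_round(work, 2, 6, 10, 14);
        quarter_round(work, 3, 7, 11, 15);
        quarter_round(work, 0, 5, 10, 15);
        quarter_round(work, 1, 6, 11, 12);
        quarter_round(work, 2, 7, 8, 13);
        quarter_round(work, 3, 4, 9, 14);
    }

    /* Add the initial state back in, touching exactly 16 words. */
    for (i = 0; i < STATE_WORDS; i++)
        store32_le(out + 4 * i, work[i] + init[i]);
}
```

Called with the key bytes 00..1f, counter 1 and the NONCE value from the listing, this sketch should produce the first keystream block of RFC 7539 section 2.3.2, which begins 10 f1 e7 e4 and is the same block shown in the 64-bit xxd output above.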
With gcc I found the program crashes in the function chachaGet():

00000b8f <chachaGet>:
 b8f: 55                      push   ebp
 b90: 89 e5                   mov    ebp,esp
 b92: 53                      push   ebx
 b93: 83 ec 04                sub    esp,0x4
 b96: e8 05 f9 ff ff          call   4a0 <__x86.get_pc_thunk.bx>
 b9b: 81 c3 65 14 00 00       add    ebx,0x1465
 ba1: 0f b6 83 28 00 00 00    movzx  eax,BYTE PTR [ebx+0x28]
 ba8: 84 c0                   test   al,al
 baa: 75 0c                   jne    bb8 <chachaGet+0x29>
 bac: c6 83 28 00 00 00 40    mov    BYTE PTR [ebx+0x28],0x40
 bb3: e8 18 fa ff ff          call   5d0 <chacha>
 bb8: 0f b6 83 28 00 00 00    movzx  eax,BYTE PTR [ebx+0x28]
 bbf: 8d 50 ff                lea    edx,[eax-0x1]
 bc2: 88 93 28 00 00 00       mov    BYTE PTR [ebx+0x28],dl
 bc8: 0f b6 c0                movzx  eax,al
 bcb: ba 40 00 00 00          mov    edx,0x40
 bd0: 29 c2                   sub    edx,eax
 bd2: 8d 83 80 00 00 00       lea    eax,[ebx+0x80]
 bd8: 0f b6 04 10             movzx  eax,BYTE PTR [eax+edx*1]
 bdc: 83 c4 04                add    esp,0x4
 bdf: 5b                      pop    ebx
 be0: 5d                      pop    ebp
 be1: c3                      ret

The program crashes at 0xbb8, "movzx eax,BYTE PTR [ebx+0x28]". At this instruction the code acts as if the ebx register had not changed since __x86.get_pc_thunk.bx was executed, but the call to chacha has changed it, leaving ebx = 0.

I worked around it by saving ebx on the stack before the call to chacha and restoring it from the stack afterwards:

uint8_t chachaGet()
{
    if(!quedanPorLeer)
    {
        __asm__("push %ebp");
        chacha();
        __asm__("pop %ebp");
        quedanPorLeer = 64;
    }
    return chachaRandomOutput[64 - (quedanPorLeer--)];
}

Now the code works:

arget@plata:~$ gcc a.c -o a -m32
arget@plata:~$ ./a | xxd
00000000: 10f1 e7e4 d13b 5915 500f dd1f a320 71c4  .....;Y.P.... q.
00000010: c7d1 f4c7 33c0 6803 0422 aa9a c3d4 6c4e  ....3.h.."....lN
00000020: d282 6446 079f aa09 14c2 d705 d98b 02a2  ..dF............
00000030: b512 9cd1 de16 4eb9 cbd0 83e8 a250 3c4e  ......N......P<N
arget@plata:~$

But, as you can understand, this isn't a real solution, especially because it isn't portable.

The problem also occurs when compiling for ARM on Raspbian:

pi@raspberrypi:~ $ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabihf/4.9/lto-wrapper
Target: arm-linux-gnueabihf
Configured with: ../src/configure -v --with-pkgversion='Raspbian 4.9.2-10' --with-bugurl=file:///usr/share/doc/gcc-4.9/README.Bugs --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.9 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.9 --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-gnu-unique-object --disable-libitm --disable-libquadmath --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.9-armhf/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.9-armhf --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.9-armhf --with-arch-directory=arm --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-sjlj-exceptions --with-arch=armv6 --with-fpu=vfp --with-float=hard --enable-checking=release --build=arm-linux-gnueabihf --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf
Thread model: posix
gcc version 4.9.2 (Raspbian 4.9.2-10)
pi@raspberrypi:~ $ uname -a
Linux raspberrypi 4.9.28+ #998 Mon May 15 16:50:35 BST 2017 armv6l GNU/Linux
pi@raspberrypi:~ $ gcc a.c -o a
pi@raspberrypi:~ $ ./a
Segmentation fault
pi@raspberrypi:~ $

Since I don't know ARM assembly well enough, I can't do the same analysis as for x86.

On x86-64 there isn't any error because the address of "quedanPorLeer" is computed as an offset from the rip register.

As requested, I am attaching the *.i file from the compilation for ARM on Raspbian.