https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121208
Bug ID: 121208
Summary: Wrong user-level interrupt vector value with TLS variable when build with optimisation
Product: gcc
Version: 15.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: charles.goedefroit at eviden dot com
Target Milestone: ---
Target: x86_64-linux-gnu
Build: /usr/src/gcc/configure --build=x86_64-linux-gnu --disable-multilib --enable-languages=c,c++,fortran,go

The issue affects GCC versions 11, 12, 13, 14 and 15.

## Some context

User-level interrupts (UINTR) are a hardware feature that Intel introduced with Sapphire Rapids processors. It lets a thread register an interrupt handler in user space, so that interrupts are delivered without going through the operating system (OS bypass).

A thread that wants to receive interrupts needs to:

- register an interrupt handler (`ui_handler` in our code) with the syscall `uintr_register_handler(ui_handler, flags)` (`syscall(471, ui_handler, 0)`);
- request a file descriptor associated with a user-level interrupt vector (UVEC) with the syscall `uvec_fd = uintr_vector_fd(UVEC, 0)` (`uvec_fd = syscall(473, 6, 0)`);
- unmask user-level interrupts with the STUI instruction (`_stui()`);
- share the `uvec_fd` file descriptor with all sender threads.

A thread that wants to send a user-level interrupt must be registered as a sender with `uipi_index = uintr_register_sender(uvec_fd, flags)` (`uipi_index = syscall(474, uvec_fd, 0)`) and can then use the SENDUIPI instruction (`_senduipi(uipi_index)`) to trigger an interrupt.

A user-level interrupt handler (`ui_handler`) is called with its parameters on the stack. The last parameter is the user-level interrupt vector (uvec).

We use a thread-local storage (TLS) variable. We created a shared library to manage UINTR and a small reproducer in `intrHandler.c`.
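The registration steps above can be sketched as thin wrappers around `syscall()`. This is only a minimal sketch, assuming a UINTR-patched kernel that uses the syscall numbers quoted above (471, 473, 474). The `UINTR_NR_*` macro names and the wrapper signatures are illustrative assumptions, not the code of our library.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <unistd.h>

/* Syscall numbers quoted in this report; assumed to match the
 * UINTR-patched kernel in use (they are not standard header macros). */
#define UINTR_NR_REGISTER_HANDLER 471L
#define UINTR_NR_VECTOR_FD        473L
#define UINTR_NR_REGISTER_SENDER  474L

/* Receiver: register the user-level interrupt handler.
 * Returns the raw syscall result (-1 on failure). */
static inline long uintr_register_handler(void *handler, unsigned int flags)
{
    return syscall(UINTR_NR_REGISTER_HANDLER, handler, flags);
}

/* Receiver: obtain a file descriptor bound to one interrupt vector. */
static inline long uintr_vector_fd(uint64_t vector, unsigned int flags)
{
    return syscall(UINTR_NR_VECTOR_FD, vector, flags);
}

/* Sender: register against the receiver's descriptor; the result is the
 * index that would later be passed to _senduipi(). */
static inline long uintr_register_sender(int uvec_fd, unsigned int flags)
{
    return syscall(UINTR_NR_REGISTER_SENDER, uvec_fd, flags);
}
```

With these wrappers, a receiver would call `uintr_register_handler((void *)ui_handler, 0)`, then `uintr_vector_fd(6, 0)`, then `_stui()`. A sender would call `uintr_register_sender(uvec_fd, 0)` and pass the returned index to `_senduipi()`.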
Build command:

```bash
# build libintrHandler_opt.so
gcc -Wall -Wextra -DNDEBUG -muintr -g -O3 -fPIC -c -save-temps -o intrHandler_opt.pic.o intrHandler.c
gcc intrHandler_opt.pic.o -shared -o libintrHandler_opt.so
# build ./uintr2Threads_opt
gcc -L. -Wl,-rpath=. -Wall -Wextra -DNDEBUG -muintr -g -o uintr2Threads_opt uintr2Threads.c -lintrHandler_opt
```

## Our issue

The `uvec` parameter of the `ui_handler` interrupt handler is loaded into a caller-saved register (`%rcx`), and that register is not saved before the call to `__tls_get_addr`. As a result, the later check on `uvec` sees an invalid value.

In `ui_handler` we set a global TLS variable (`th_in_interrupt_handler`) to 1 so that we know when we are in interrupt context, and we set this TLS variable back to 0 before `ui_handler` returns. Just after setting `th_in_interrupt_handler` to 1, we check the `uvec` value to distinguish between different vectors and perform different actions. In our example we only check for a single `uvec` value (6) because that is enough to reproduce the bug.

When we build without optimization (`-O0`), everything works: we send UVEC 6 and the `if` statement in `ui_handler` takes the branch for 6.

When we build with optimization (`-O1`, `-O2` or `-O3`), `uvec` holds an invalid value: we send UVEC 6 and the `if` statement in `ui_handler` takes the `else` branch with an invalid value.

So we checked the assembly code with `objdump -dS --disassemble=ui_handler libintrHandler_opt.so`.
```txt
    117f:  48 83 ec 08            sub    $0x8,%rsp
    1183:  48 8b 4c 24 60         mov    0x60(%rsp),%rcx
  th_in_interrupt_handler = 1;
    1188:  fc                     cld
    1189:  66 48 8d 3d 1f 2e 00   data16 lea 0x2e1f(%rip),%rdi        # 3fb0 <th_in_interrupt_handler@Base>
    1190:  00
    1191:  66 66 48 e8 b7 fe ff   data16 data16 rex.W call 1050 <__tls_get_addr@plt>
    1198:  ff
    1199:  c6 00 01               movb   $0x1,(%rax)
  if(uvec == UVEC) {
    119c:  48 83 f9 06            cmp    $0x6,%rcx
    11a0:  75 2d                  jne    11cf <ui_handler+0x5f>
```

In the assembly, we see that `uvec` is loaded into the `%rcx` register, then the libc `__tls_get_addr@plt` is called, and finally the `if` check is done (`cmp $0x6,%rcx`). So the `RCX` register isn't saved before the libc `__tls_get_addr@plt` call. `RCX` is a caller-saved register and must be saved before any function call. During the `__tls_get_addr@plt` call the `%rcx` register changes and is not restored.

To generate valid code, I added an `asm volatile` line that clobbers `%rcx` at the beginning of `ui_handler`:

```c
__attribute__((target("general-regs-only")))
__attribute__((interrupt))
void ui_handler(__attribute__((unused)) struct __uintr_frame *ui_frame,
                uint64_t uvec)
{
    /* Workaround: the %rcx clobber makes GCC keep uvec in another register. */
    asm volatile ("nop" : : : "%rcx");
    th_in_interrupt_handler = 1;
    if(uvec == UVEC) {
        (*callback)();
    } else {
        exit(uvec);
    }
    th_in_interrupt_handler = 0;
}
```

With this change, `%r8` is used instead and the `uvec` value is valid in optimized builds.

```txt
    117f:  48 83 ec 08            sub    $0x8,%rsp
    1183:  4c 8b 44 24 60         mov    0x60(%rsp),%r8
  asm volatile ("nop" : : : "%rcx");
    1188:  90                     nop
  th_in_interrupt_handler = 1;
    1189:  fc                     cld
    118a:  66 48 8d 3d 1e 2e 00   data16 lea 0x2e1e(%rip),%rdi        # 3fb0 <th_in_interrupt_handler@Base>
    1191:  00
    1192:  66 66 48 e8 b6 fe ff   data16 data16 rex.W call 1050 <__tls_get_addr@plt>
    1199:  ff
    119a:  c6 00 01               movb   $0x1,(%rax)
  if(uvec == UVEC) {
    119d:  49 83 f8 06            cmp    $0x6,%r8
    11a1:  75 2d                  jne    11d0 <ui_handler+0x60>
```

`gcc -v` output:

```txt
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-linux-gnu/15.1.0/lto-wrapper
Target: x86_64-linux-gnu
Configured with: /usr/src/gcc/configure --build=x86_64-linux-gnu --disable-multilib --enable-languages=c,c++,fortran,go
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 15.1.0 (GCC)
```
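A different shape for the workaround, given only as an untested sketch: keep `ui_handler` as a thin wrapper and move the TLS access and the `uvec` check into an ordinary `noinline` function, so that `uvec` is handed over as a normal argument before `__tls_get_addr` is reached. It has not been verified that this avoids the problem on the affected GCC versions. The names `ui_handler_body`, `count_hit` and `callback_hits` are hypothetical, and the declarations of `th_in_interrupt_handler` and `callback` are assumed because this report does not show them.

```c
#include <stdint.h>
#include <stdlib.h>

#define UVEC 6

/* Assumed declarations: the report does not show the real ones. */
static __thread volatile int th_in_interrupt_handler;
static int callback_hits;                          /* hypothetical counter  */
static void count_hit(void) { callback_hits++; }   /* hypothetical callback */
static void (*callback)(void) = count_hit;

/* Ordinary function using the normal calling convention. noinline keeps
 * the TLS access out of the interrupt handler's own body. */
__attribute__((noinline))
void ui_handler_body(uint64_t uvec)
{
    th_in_interrupt_handler = 1;
    if (uvec == UVEC) {
        (*callback)();
    } else {
        exit((int)uvec);
    }
    th_in_interrupt_handler = 0;
}

/* The interrupt handler would then reduce to (not compiled here):
 *
 *   __attribute__((target("general-regs-only")))
 *   __attribute__((interrupt))
 *   void ui_handler(struct __uintr_frame *ui_frame, uint64_t uvec)
 *   {
 *       (void)ui_frame;
 *       ui_handler_body(uvec);
 *   }
 */
```

Calling `ui_handler_body(UVEC)` directly runs the callback once and leaves `th_in_interrupt_handler` at 0, which makes the logic easy to exercise outside interrupt context.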