This was addressed in https://usn.ubuntu.com/usn/usn-3548-1/ in Ubuntu 17.10 for the linux kernel, and in https://usn.ubuntu.com/usn/usn-3548-2/ for the linux-hwe and other backport kernels. Closing.
** Information type changed from Private Security to Public Security

** Changed in: linux (Ubuntu Artful)
       Status: New => Fix Released

** Changed in: linux-azure (Ubuntu Xenial)
       Status: New => Fix Released

** Changed in: linux-gcp (Ubuntu Xenial)
       Status: New => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1745290

Title:
  panic or arbitrary jump due to coding error in retpoline for system
  call entry

Status in linux package in Ubuntu:
  Invalid
Status in linux-azure package in Ubuntu:
  Invalid
Status in linux-gcp package in Ubuntu:
  Invalid
Status in linux-hwe package in Ubuntu:
  Invalid
Status in linux-oem package in Ubuntu:
  Invalid
Status in linux source package in Xenial:
  Invalid
Status in linux-azure source package in Xenial:
  Fix Released
Status in linux-gcp source package in Xenial:
  Fix Released
Status in linux-hwe source package in Xenial:
  Fix Released
Status in linux-oem source package in Xenial:
  Fix Released
Status in linux source package in Artful:
  Fix Released
Status in linux-azure source package in Artful:
  Invalid
Status in linux-gcp source package in Artful:
  Invalid
Status in linux-hwe source package in Artful:
  Invalid
Status in linux-oem source package in Artful:
  Invalid

Bug description:
  Calling an invalid system call number panics the system (or
  potentially calls anywhere in kernel memory)

  In the system call entry, there's an andl, cmpl sequence to test the
  system call number in %rax/%eax for validity. If that fails, it will
  "ja 1f", which is the entry into the retpoline logic. That logic will
  eventually "retpoline" jump to what's in %r10 prior to this code
  fragment. The value in %r10 is the 4th system call argument passed
  directly in from user space, e.g.,

    rv = syscall(0x270f, 0x1111, 0x2222, 0x3333, 0x4444);

  so the kernel will attempt to jump to whatever is in the "0x4444"
  position above.
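The control flow described above can be modelled in user space. The following is only a sketch: it does not touch the kernel, the names `model_outcome`, `model_table_entry`, `entry_intended` and `entry_reported` are hypothetical, `MODEL_NR_SYSCALL_MAX` and the fake handler addresses are made-up values, and the `__SYSCALL_MASK` step is left out for brevity. It contrasts what the range check is meant to do with what the report says actually happens when the check fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical table size, chosen only for this model. */
#define MODEL_NR_SYSCALL_MAX 332u

/* Outcome of one modelled system call entry. */
struct model_outcome {
    int      jumped_to_r10; /* 1 if the retpoline thunk was reached */
    uint64_t jump_target;   /* value held in r10 at that point */
    long     ret;           /* value left in rax when no call is made */
};

/* Stand-in for sys_call_table[nr]: a fake, recognisable handler address. */
static uint64_t model_table_entry(uint64_t nr)
{
    assert(nr <= MODEL_NR_SYSCALL_MAX);
    return 0xffffffff81000000ull + nr * 0x100;
}

/* Intended behaviour: an out-of-range number skips the call entirely
 * and -ENOSYS stays in place as the return value. */
static struct model_outcome entry_intended(uint64_t nr, uint64_t arg4)
{
    struct model_outcome o = { 0, 0, -ENOSYS };
    (void)arg4;                     /* never used as a jump target */
    if (nr > MODEL_NR_SYSCALL_MAX)
        return o;                   /* "ja" to the final label */
    o.jumped_to_r10 = 1;
    o.jump_target = model_table_entry(nr); /* r10 reloaded from the table */
    return o;
}

/* Behaviour described in the report: a failed check still reaches the
 * thunk, and r10 still holds the caller-supplied 4th argument. */
static struct model_outcome entry_reported(uint64_t nr, uint64_t arg4)
{
    struct model_outcome o = { 1, arg4, -ENOSYS };
    if (nr <= MODEL_NR_SYSCALL_MAX)
        o.jump_target = model_table_entry(nr);
    return o;
}
```

With the arguments from the example call, `entry_reported(0x270f, 0x4444)` reports a jump to 0x4444, whereas `entry_intended(0x270f, 0x4444)` makes no jump and leaves `-ENOSYS` as the result. For in-range numbers the two models agree.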
  The above call results in a panic and:

  [  102.983486] BUG: unable to handle kernel paging request at 0000000000004444
  [...]
  [  103.165771] Code:  Bad RIP value.
  [  103.169205] RIP: 0x4444 RSP: ffffac8780be3f50

  The issue appears in the artful (4.13) and derivative kernels, because
  there is a coding error in the assembly language
  arch/x86/entry/entry_64.S:

  entry_SYSCALL_64_fastpath:
          /*
           * Easy case: enable interrupts and issue the syscall.  If the syscall
           * needs pt_regs, we'll call a stub that disables interrupts again
           * and jumps to the slow path.
           */
          TRACE_IRQS_ON
          ENABLE_INTERRUPTS(CLBR_NONE)
  #if __SYSCALL_MASK == ~0
          cmpq    $__NR_syscall_max, %rax
  #else
          andl    $__SYSCALL_MASK, %eax
          cmpl    $__NR_syscall_max, %eax
  #endif
          ja      1f      /* return -ENOSYS (already in pt_regs->ax) */
          movq    %r10, %rcx

          /*
           * This call instruction is handled specially in stub_ptregs_64.
           * It might end up jumping to the slow path.  If it jumps, RAX
           * and all argument registers are clobbered.
           */
          movq    sys_call_table(, %rax, 8), %r10
          jmp     1f
  4:      callq   2f
  3:      nop
          jmp     3b
  2:      mov     %r10, (%rsp)
          retq
  1:      callq   4b
  .Lentry_SYSCALL_64_after_fastpath_call:
          movq    %rax, RAX(%rsp)
  1:

  Note the "ja 1f"; it's meant to jump to the later "1:" label, but
  disassembly of the compiled object shows that it goes to the
  "1: callq 4b" instead (the retpoline logic).

  The xenial code for this section looks like:

          jmp     1001f
  1004:   callq   1002f
  1003:   nop
          jmp     1003b
  1002:   mov     %r10, (%rsp)
          retq
  1001:   callq   1004b

  and is not subject to the panic.
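The "ja 1f" behaviour follows from how GNU as resolves numeric local labels: "Nf" binds to the nearest following definition of label N and "Nb" to the most recent previous one. Below is a small sketch of that rule in C, applied to the artful fragment. The names `label_def`, `resolve_local_label` and `artful_fragment` are hypothetical, and the line numbers are simply assigned by counting the instructions in the fragment starting from the "ja 1f" as line 1; they are not taken from the real source file.

```c
#include <assert.h>
#include <stddef.h>

/* One numeric local label definition: its number and the line it is on. */
struct label_def {
    int number;
    int line;   /* 1-based position within the fragment */
};

/*
 * Resolve a reference such as "1f" or "4b" made on ref_line.
 * 'f' picks the nearest definition after ref_line, 'b' the nearest
 * definition at or before it. Returns the defining line, or -1 if
 * no matching definition exists.
 */
static int resolve_local_label(const struct label_def *defs, size_t n,
                               int number, char direction, int ref_line)
{
    int best = -1;

    assert(direction == 'f' || direction == 'b');
    for (size_t i = 0; i < n; i++) {
        if (defs[i].number != number)
            continue;
        if (direction == 'f') {
            if (defs[i].line > ref_line &&
                (best == -1 || defs[i].line < best))
                best = defs[i].line;
        } else {
            if (defs[i].line <= ref_line && defs[i].line > best)
                best = defs[i].line;
        }
    }
    return best;
}

/*
 * Labels of the artful fragment, with illustrative line numbers:
 * line 1 is "ja 1f", line 4 is "jmp 1f", line 10 is "1: callq 4b",
 * and line 12 is the final bare "1:".
 */
static const struct label_def artful_fragment[] = {
    { 4, 5 }, { 3, 6 }, { 2, 8 }, { 1, 10 }, { 1, 12 },
};
```

Under these assumptions, `resolve_local_label(artful_fragment, 5, 1, 'f', 1)` yields line 10, the "1: callq 4b" thunk entry, rather than the final "1:" on line 12, which matches what the disassembly showed. The xenial fragment numbers its thunk labels 1001 to 1004, so a "1f" there would presumably not collide with them.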